Abstract
We can now collect cell-count data across whole animal brains quantifying recent neuronal activity, gene expression, or anatomical connectivity. This is a powerful approach since it is a multi-region measurement, but because the imaging is done post-mortem, each animal only provides one set of counts. Experiments are expensive and since cells are counted by imaging and aligning a large number of brain sections, they are time-intensive. The resulting datasets tend to be under-sampled with fewer animals than brain regions. As a consequence, these data are a challenge for traditional statistical approaches. We demonstrate that hierarchical Bayesian methods are well suited to these data by presenting a ‘standard’ partially-pooled Bayesian model for multi-region cell-count data and applying it to two example datasets. For both datasets the Bayesian model outperformed standard parallel t-tests. Overall, the Bayesian approach’s ability to capture nested data and its rigorous handling of uncertainty in under-sampled data can substantially improve inference for cell-count data.
Significance Statement
Cell-count data is important for studying neuronal activation and gene expression relating to the complex processes in the brain. However, the difficulty and expense of data collection means that such datasets often have small sample sizes. Many routine analyses are not well-suited, especially if there is high variability among animals and surprising outliers in the data. Here we describe a multilevel, mixed effects Bayesian model for these data and show that the Bayesian approach improves inferences compared to the usual approach for two different cell-count datasets with different data characteristics.
Introduction
In studying the brain we are often confronted with phenomena that involve specific subsets of neurons distributed across many brain regions. Computations, for example, are performed by neuronal networks connecting cells in different parts of the brain and, in development, neurons in different anatomical regions of the brain share the same lineage. An example of each of these types will be considered here, but the challenge is very general: how to measure and analyze multi-region neuronal data with cellular resolution.
In a typical cell-count experiment, gene expression is used to tag the specific cells of interest with a targeted indicator (1). The brain is sliced, an entire stack of brain sections from a single animal is imaged, aligning and registering the images to a standardised brain atlas such as the Allen mouse atlas (2–5), segmenting the images into anatomical regions and counting the labeled cells in each region. The resulting dataset consists of labeled cell counts across each of ∼10–100 brain regions. This technology is being deployed to address questions in a broad range of neuroscience subfields, for example: memory (6–8), neurodegenerative disorders (9), social behavior (10), and stress (11).
Cell-counts are often compared across groups of animals which differ by an experimental condition such as drug treatment, genotype or behavioural manipulation. However, the expense and difficulty of the experiment means that the number of animals in each group is often small, ten is a typical number but fewer than ten is not uncommon. This means that these data are undersampled: the dimensionality of the data, which corresponds to the number of brain regions, is much larger than the number of samples, which usually corresponds to the number of animals (Figure 1A). Current statistical methods are not well-suited to these nested wide-but-shallow datasets. Furthermore, due to the complicated preparation and imaging procedure, there is often missing data and there is variability derived from experimental artifacts.
In cell count data there are two obvious sources of noise. The first of these is easy to describe, if a region has an rate that determines how likely a cell is to be marked for counting, then the actual number of marked cells is sampled from a Poisson distribution. The second source of noise is the animal to animal variability of the rate itself and this depends on diverse features of the individual animal and the experiment that are often unrelated to the phenomenon of interest. The challenge is to control for outliers and ‘poor’ data points whose rate is noisy, while extracting as much information as possible about the underlying process. Dealing with outliers is often an opaque and ad hoc procedure, it is also a binary decision, a point is either excluded so it does not contribute, or included, noise and all. This is where partial pooling helps. Partial pooling allows for the simultaneous estimation of individual parameters with the population distribution that describes them. This helps the data to self-regularise and elegantly balances the contribution of informative and weak observations to parameter values.
In recent years Bayesian approaches to data analysis have become powerful alternatives to classical frequentist approaches (12–14), including for some types of neuroscience data: such as neurolinguistics (15), neural coding (16, 17), synaptic parameters (18–21) and neuronal-circuit connectivity (22, 23). A Bayesian approach is particularly well suited to cell-count data but has not previously been applied to this problem.
A Bayesian approach formalizes the process of scientific inference, it distinguishes the data and a probabilistic mathematical model of the data. This model has a likelihood which gives the probability of the observed data for a given set of model parameters. The model often has a hierarchical structure which we compose to reflect the structure of the experiment and the investigators hypothesis of how the data depends on experimental condition. This hierarchy determines a set of a priori probabilities for the parameter values. The result of Bayesian inference is a probability distribution for these model parameters given the data, termed the posterior.
There are three advantages of a Bayesian approach that we want to emphasis, (1) while traditional multi-level models also allow a hierarchy (24), Bayesian models are more flexible and the role of the model is clearer, (2) since the result of Bayesian inference is a probability distribution over model parameters, it indicates not just the fitted value of a parameter but the uncertainty of the parameter value. Finally (3) Bayesian models tend to make more efficient use of data and therefore improving statistical power.
Here, our aim is to introduce a ‘standard’ Bayesian model for cell count data. We illustrate the application of this model to two datasets, one related to neural activation and the other to developmental lineage. In both cases the Bayesian model produces clearer results than the classical frequentist approach.
Materials and Methods
Data
To illustrate our approach we consider two example applications, one which counts cells active in regions of the recognition memory circuit of rat during a familiarity discrimination task, and the other which examines the distribution of a specific interneuron type in the mouse thalamus.
Case study one – transient neural activity in the recognition memory circuit
The recognition memory network, Figure 3, is a distributed network which has been well studied using a variety of behavioural tasks. It includes the hippocampus (HPC) and perirhinal cortex (PRH), shown to deal with object spatial recognition and familiarity discrimination respectively (25–27); medial prefrontal cortex (MPFC), concerned with executive functions such as decision making but also with working memory, and the temporal association cortex (TE2) used for acquisition and retrieval of long-term object-recognition memories (28). The nucleus reuniens (NRe) is also believed to be an important component of the circuit (25, 29), owing to its reciprocal connectivity with both MPFC and HPC (30). In previous studies, lesions of the NRe have been shown to significantly impair long-term but not short-term object-in-place recognition memory (29). The data analysed in this case study were collected to investigate the role of the NRe in the recognition memory circuit through contrasting the neural activation for animals with a lesion in the NRe with neural activation for animals with a sham surgery. The immediate early gene c-Fos is rapidly expressed following strong neural activation and is useful as a marker of transient neural activity. Animals in the experiment performed a familiarity discrimination task (single-item recognition memory), discriminating novel or familiar objects with or without an NRe lesion and the number of cells that expressed c-Fos was counted in regions across the recognition memory circuit. The two-by-two experimental design allocated ten animals to each of the four experiment groups {sham, lesion} × {novel, familiar} and cell counts were recorded from a total of 23 brain regions. The visual cortex V2C and the motor cortex M2C were taken as control regions as they were not expected to show differential c-Fos expression in response to novel or familiar objects (31).
Case study two – Ontogeny of inhibitory neurons in mouse thalamus and hypothalamus
The second dataset comes from a study, (32), that, in part, counted the number of inhibitory interneurons in the thalamocortical regions of mouse. Sox14 is a gene associated with inhibitory neurons in subcortical areas. It is required for the development and migration of local inhibitory interneurons in the dorsal lateral geniculate nucleus (LGd) of the thalamus (32, 33). Consequently, Sox14 is useful for identifying discrete neuronal populations in the thalamus and hypothalamus (or in the diencephalon) (32–34). A heterozygous (HET) knock-in mouse line Sox14GFP/+ to mark Sox14 expressing neurons with green fluorescent protein (GFP), was compared to a homozygous knock-out (KO) mouse line Sox14GFP/GFP engineered to block expression of the endogenous Sox14 coding sequence (35). Each animal produced two samples, one for each hemisphere giving a total of ten data points, six belonging to HET (three animals), and four to KO (two animals). Each observation is 50 dimensional, corresponding to fifty individual brain regions in each hemisphere.
Hierarchical modeling
Our goal in both cases is to quantify group differences in the data. We start with a ‘standard’ hierarchical model; this model can be further refined or changed to improve the analysis and to more clearly match the structure of a particular experiment, we will give an example of this for our second dataset. At the bottom of the model are the data themselves, the cell counts yi. The index i runs over the full set of samples, in this case 23 × 10 × 4 datapoints in the first study, and 50 × 6 + 50 × 4 in the second. The basic assumption the model makes is that this count is derived from an underlying propensity, λi > 0, which depends on brain region and, potentially, group:
Hence, in this case, the propensity λi is the mean of the Poisson distribution, and the statistical model describes the dependence of this parameter on brain region and animal. Since λi is strictly positive a log-link function is used:
where
and we have used ‘array notation’ (36), mapping the sample index i to properties of the sample, so r[i] returns the region index of observation i, and similar for g[i] but for groups and animals.
We use partial pooling. Thus, ignoring Ei for now, the rate has been split between two terms, θr[i],g[i] is the fixed effect which is constant across animals and γr[i],a[i] is the random effect which captures the animal to animal variability. While θrg models the mean for log-cell-count for each region, given the condition; γi models variation around this mean. For this reason, γi is assumed to follow a normal distribution with zero mean. The regression term may appear over-parameterized, without θrg the γi could ‘do the work’ of matching the data. However, the model is regularized by a prior; observations with a weak likelihood will have their random effect γi shrunk towards the population location. The amount of regularization depends on the variation in the population, a quantity that is estimated from each likelihood. This is how partial pooling works as an adaptive prior for ‘similar’ parameters (Figure 1B). The data ‘pools’ some evidence while still allowing for individual differences in sample.
The final term is the exposure Ei. Cell counts may be recorded from sections with different areas. The exposure term scales the parameters in the linear model as the recording area increases (13). In our model, the exposure is equal to the logarithm of the recording area; this value is available as part of the experimental data.
The set of parameters τr,g models the population standard deviations of the noise for each region r and animal group g.
When working on the log scale, priors for these parameters are typically derived in terms of multiplicative increases. Since the parameters are positive they are assigned a half-normal distribution
with an appropriately chosen scale s > 1. For our analyses we used s = 1.05 because this gives a HalfNormal distribution with 95% of its density in the interval [0, log(1.1)]. This translates into an approximate 10% variation around exp(θ) at the upper end, which is a moderately informative prior, reflecting our belief that within-group animal variability is small relative to between-group variability. This regularization also helps model inference when the datasets are undersampled.
Horseshoe prior
Cell count data often has outliers, for example due to experimental artifacts. Since by default the likelihood does not account for these outliers, they may cause substantial changes in fitted parameter values. This is demonstrated in Figure 2, where a careless application of the Poisson distribution on data with several zero counts has a large influence on the posterior distribution. There are two general options for dealing with outliers: either modeling them in the likelihood or in the prior. Although the likelihood option is preferred as it is more direct, see our zero-inflation model below, it can be hard to design because it requires knowledge of the outlier generation process. The alternative is via a flexible prior such as the horseshoe (37, 38). This more generic option may be suitable as a default approach if the outliers are poorly understood.
The horseshoe prior is a hierarchical prior for sparsity. It introduces an auxiliary parameter κi that multiplies the population scale τ. This construction allows surprising observations far from the bulk of the population density to escape regularisation.
An example of this is given in Figure 2 as the bottom left cell of the 2 × 2 table of models. The horseshoe prior often uses a Cauchy distribution but in our case the heavy tail causes problems for the sampling algorithm, see SI section 8.
Zero inflation
A particular trait of the second dataset is that there are a large number of zero data points (∼6%). Although a zero observation is always possible for a Poisson distribution, for plausible values of the propensity, zeros should be rare. It is likely that for some regions the experiment has not worked as expected and the zeros show that something has ‘gone wrong’ and that the readings are not well described by a Poisson distribution. Here we extend the model to include this possibility. This is a useful elaboration of the ‘standard model’, in the standard model the horseshoe prior ensures that these anomalous readings only have a small effect on the result but it is more informative to extend the model to include them. While this particular extension is specific to these data, it also serves as an example of how a standard Bayesian model can serve as a starting for point for an investigation of the data.
The zero-inflated Poisson model is intended to model a situation were there are zeros unrelated to the Poisson distribution, in this case this might, for example, be the result of an error in the automated registration process that identifies regions and counts their cells. It is a mixture model, if
there is a probability π that yi = 0 and a 1−π probability that yi follows a Poisson distribution with rate λi. Importantly this means there are two ways in which yi can be zero, through the Bernoulli process parameterized by π, or through the Poisson distribution. This has the effect of ‘inflating’ the probability mass at zero with the additional parameter π giving the proportion of extra zeros in the data that could not be explained by the standard Poisson distribution. This distribution can be visualized in Figure 2 and further mathematical details are described in SI 2C.
Model inference
Posterior inference was performed with the probabilistic programming language Stan (39), using its custom implementation of the No-U-Turn (NUTS) sampler (40, 41). For each model, the posterior was sampled using four chains for 8000 iterations, with half of these being attributed to the warm-up phase; giving a total of 16000 samples from the posterior distribution.
Results
We describe differences in estimated counts between groups in terms of log2-fold changes. Fold changes are useful because they prevent differences that are small in absolute magnitude from being masked by regions with high overall expression. Our results compare Bayesian highest density intervals (HDI) with the confidence interval (CI) from an uncorrected Welch’s t-test. The Bayesian HDI is calculated from the posterior distribution and is the smallest width interval that includes a chosen probability, here 0.95 (to correspond to α = 0.05), and summarizes the meaningful uncertainty over a parameter of interest.
Case study one – transient neural activity in the recognition memory circuit
Results for the first dataset are presented in Figure 4 with Figure 4A plotting cell-count differences between the novel and familiar conditions without lesion and Figure 4B with lesion. These data were collected to investigate the role of different hippocampal and adjacent cortical regions in memory. However, some regions of interest, such as the intermediate dendate gyrus (IDG), and the dorsal subiculum (DSUB) look under-powered: in the sham animals, for both regions there is a markedly non-zero difference in expression between the novel and familiar conditions but a wide confidence interval overlapping zero makes the evidence unreliable (orange bars Figure 4A).
In contrast, the Bayesian estimates (green bars Figure 4) produce a clear result. For a number of brain regions in Figure 4A, sham-novel animals have higher expression than sham-familiar ones. These differences disappear in Figure 4B with lesion-novel and lesion-familiar animals showing roughly equal cell counts. This indicates that the difference is only present when the NRe is intact.
Case study two – Ontogeny of inhibitory interneurons of the mouse thalamus
The estimated log2-fold difference in GFP expressing cells between the two genotypes, for each of the fifty brain regions, are plotted in Figure 5. This include the purple and pink 95% HDI from the horseshoe and zero-inflated Poisson models along with the 95% confidence interval arising from a t-test in orange.
The use of the zero-inflated Poisson distribution uncovers some interesting differences when compared to the t-test confidence intervals. One example of this is the result for tuberomammillary nucleus, dorsal part (TMd). For this region the HET animals have high GFP expression across both hemispheres, yet animal three has a reading of zero for both hemispheres. This injects variability into the standard deviation of the HET group. Consequently, the pooled standard deviation used in the t-test is large and almost certainly guarantees a non-significant result. Furthermore, the sample mean of this region looks nothing like zero, but also nothing like the other two animals with positive counts. In addition to the wide interval, the sign of the difference does not agree with the data. Similarly, the medial preoptic nucleus (MPN) also suffers from poor estimation. Once again, this region contains a single HET animal for which the reading from both hemispheres is zero. The zero-inflated Poisson produces a posterior distribution of the appropriate sign with small uncertainty.
Despite the large difference in interval estimation between the HDI and CI for many brain regions, as the data becomes stronger from the perspective of the frequentist p-value towards the right-hand side of the second row in Figure 5, they become much more compatible. The variation within groups is very small for these regions; further regularization is not necessary and so the impact of partial pooling has been reduced. The sample estimate of the t-test has ‘caught up’ to the regularized estimate because the signal is strong.
Discussion
We have presented a standard workflow for Bayesian analysis of multi-region cell-count data. We propose a likelihood and appropriate priors with a nested hierarchical structure reflecting the structure of the experiment. We applied this to two distinct example datasets and demonstrate that they capture more fruitfully the characteristics of the data when compared to field-standard frequentist analyses.
For both case studies, the Bayesian uncertainty intervals are more precise than the confidence intervals. These confidence intervals tend to be quite wide on these data because of the small sample size and because of violations to their parametric model assumptions. Our workflow uses a horseshoe prior, along with the partial pooling, this allows our model to deal effectively with outliers. Furthermore, for the data sizes presented here, a full Bayesian inference using Stan does not require long computation time, or even particularly high performance hardware. Modern multi-core laptop processors are quite sufficient for this task. Fitting a model typically takes less than an hour.
The workflow we have exhibited here is intended as a standard approach. We believe that, without extension, it will provide a robust model for cell-count data. However, we also suggest that the standard workflow can be a useful first step for a more comprehensive, extended, model when one is required. We have given an example of this for the second dataset where the anomalous zeros prompted us to change the likelihood to a zero-inflated Poisson. There are other possibilities, for example, zero inflation is not the only way to handle an anomaly in the number of zeros: the hurdle model is an alternative, (42). This is not a mixture model, instead it restricts the probability of zeros to some value π with the probabilities for the positive counts coming from a truncated Poisson distribution. The hurdle mode can deflate as well as inflate the probability mass at zero. This did not match the situation in the data we considered but might for other datasets. Another extension might involve tighter priors based on previous experiments. This is likely to be very relevant for cell-count data since these experiments are rarely performed in isolation and so prior information can be leveraged from a history of empirical results.
One obvious elaboration of our model would replace normal distributions with multivariate normal distributions. This would have two advantages. Firstly, correlations are difficult to estimate for under-sampled data. Including correlation matrix priors provides extra information, for example, based on anatomical connectivity, that can aid the statistical estimation of other parameters. Secondly, it would more closely match our understanding of the experiment: we know that activity is likely to be correlated across regions and so it is apposite to include that directly in the model. Unfortunately the problem of finding a suitable prior for the correlation proved insurmountable: the standard Lewandowski-Kurowicka-Joe distribution (43) which has been useful in lower-dimensional situations is too regularising here. This is an area where further work needs to be done.
It is important to highlight that a mixed effects model is not a uniquely Bayesian construction. Indeed, any model that tries to include more sophistication through hierarchical structures, Bayesian or otherwise is useful. However, non-Bayesian models can be complicated and opaque, they are also often more restrictive, for example they often assume normal distributions, and circumventing these restrictions can make the models even less transparent. A Bayesian approach is, at first, unfamiliar; this can make it seem more obscure than better established methods, but, in the long run, Bayesian models are typically clearer and do not involve so many different assumptions and so many fine adjustments.
Code and data availability
The code necessary to run the models presented in this manuscript can be found at our Github respository: https://github.com/SydneyJake/Hierarchical Bayesian Cell Counts. The data for case study one on nucleus reunion lesion are available from https://doi.org/10.5281/zenodo.12787211 (44). The data from case study two on Sox14 expressing neurons are available from https://doi.org/10.5281/zenodo.12787287 (45).
Acknowledgements
We are grateful to Andrew Dowsey and Matthew Nolan for useful discussion and helpful suggestions. We would like to acknowledge funding from the EPSRC (EP/R513179/1 Doctoral Training Partnership PhD studentship to SD and EP/W024020/1 to SS), BBSRC (BB/L02134X/1 to ECW and BB/R007020/1 to AD), Wellcome Trust (206401/Z/17/Z to ECW), Leverhulme Trust (RF-2021-533 to CJH), and MRC (MR/S026630/1 to COD)
Appendix
1. Full model
Here we present the complete mathematical model for each of of the three models applied in the main text. In all cases, the exposure term is only necessary for the first case study. Area was not available for the second so the exposure term was left out.
A. Poisson model
B. Horseshoe model
C. Zero-inflated model
2. Distributions
A. Zero inflated Poisson distribution
The zero inflated Poisson distribution is a mixture distribution with mixing parameter π. The distribution is formally defined below. If,
then
where
Equivalently, the mixture can be specified with an indicator function.
B. Half Normal distribution
The HalfNormal distribution coincides with a zero-mean normal distribution truncated at zero. It has a single scale parameter σ. If,
then
is the probability density function.
3. Additional methods
A. Fold differences
In our results we present differences between experimental groups in terms of log2-fold differences. We calculate this as follows. The parameter of interest θr,g=i is modeled on the natural log scale owing to the log-link function necessary for the Poisson regression. At the average animal the difference,
is the natural log of the ratio of the expected counts. To obtain log2-fold differences we simply change the base by multiplying log Δij by log2(e).
B. Data transformations
To facilitate a comparison with the Bayesian intervals in terms of log2-fold differences, it was necessary to add one to any zero counts before applying the t-test.
C. Non-centered parameterization
Hierarchical models can produce geometry that is difficult for the sampler to explore. Fortunately, there exists a simple reparameterisation known as non-centering that can remedy this problem. In our model, instead of sampling γi directly, we sample the parameter instead, and use it to reconstruct γi. That is, sample from a standard normal distribution,
and reconstruct γi as a deterministic function of sampled values of and τr[i],g[i].
This removes the frustrating joint behaviour between of (γi, τr[i],g[i]), and promotes efficient sampling.
D. Preprocessing – case study 1
In these data some animals produced more than on reading per brain region. Before fitting our model these were summed together to produce a single count; the exposure term was also properly adjusted to correctly reflect the area of the recording site.
4. Software, packages and libraries
R version 4.2.1–“Funny-looking-kid”.
Computation was performed locally on a Dell XPS 13 7390 laptop (Intel i7-10510U @ 1.80 GHz, 16 GB of RAM; Ubuntu 20.04.4 LTS)
Panels composed using inkscape version 1.2.2.
5. Glossary
Here we include a glossary of terms for the brain regions in each case study.
6. Sampler diagnostics
The basic Poisson model (Equation 1) was sampled excellently. Measures of sampling performance such as (46, 47), and effective sample size, were all satisfactory (SI figure 1). Similarly, for the zero-inflated model (Equation 7), no problems were observed for any of the diagnostics (SI figure 2). Contrasting with this, the horseshoe model (Equation 6), exhibited some signs of fitting problems (SI figure 3). Divergences were not observed, and given the longer chain length this is reassuring evidence against biased computation. However for many parameters the effective sample size is much lower than we would like to see. This is reflected in the trace plots: the sampler is not making large jumps across the parameter space implying high auto-correlation and low effective sample size. Unfortunately, the horseshoe is notoriously hard to fit and we resort to brute-force methods such as increasing the number of iteration and reducing the step size of the sampler to improve the inference. In the following three plots diagnostics have been summarised with the following three items:
A: The performance of the sampler is illustrated by plotting (R-hat, ideal) against the ratio of the effective number of samples (larger is better) for each parameter in the model. Points represent individual parameters in the model, and have further been colour coded by their type. For example, all θr,g are coloured in green. Points have also been scaled based on how numerous the parameters are so the more numerous parameters have smaller dots, the less numerous, larger.
B: A histogram comparing the marginal energy distribution πE, and the transitional energy distribution πΔE of the Hamiltonian. Ideally, these distributions should match each other closely if the posterior distribution has been properly explored by the sampler.
C: For each parameter type the parameter with the ‘poorest’ mixing (largest ) are presented with a post-warmup trace plot that overlays the ordered sequence of samples from each of the four chains. Corresponding points in A are marked with a black border and zero transparency.
7. Posterior predictive checking
Posterior predictive checks use the posterior predictive distribution,
where p(θ|y) is the posterior distribution, and p(yrep |θ) is the data distribution for yrep that follows the same form as the likelihood for y. If we can verify that the posterior predictive distribution can generate replicate datasets with similar statistics to the observed data, then we might conclude that our model is consistent with the observed data and useful for answering questions about it. In practice a Monte-carlo approach is used to approximate statistics of the posterior predictive distribution. For example, if T is the test statistic of interest such as the sample mean, then
For 1, …, S.
sample θs ∼ p(θ|y)
sample
calculate , where T is the statistic of interest
return
The posterior predictive checks that follow examine two important statistics of count data. 1) The standard deviation of the data (dispersion), as figure a. 2) The proportion of zeroes in the data (zero-inflation), as figure B. The value of the test statistic applied to the observed data T (y) is plotted as a solid purple line, and the distribution of test statistics as a purple histogram.
8. Horseshoe densities
In our model a horseshoe prior was used to allow some γi, typically those informed by yi = 0, to escape regularisation by partial pooling. However, we encountered many problems with the default parameterisation that assigns a HalfCauchy density to the individual inflation parameters κi. In Figure 13A, the proportional conditional posterior density
where θ and τ have been fixed, is plotted when yi lies close to the population mean (left), or when it equals zero (right). Note that the x-axis is and not γi because the model samples a non-centered parameterisation. That is,
captures the deviations of γi around the population mean θ.
Figure 13B plots samples from the marginal posterior , when fitting to the data y = {770, 820, 713, 541, 0, 0}. Each data point corresponds to one of the six plots in Figure 13B. This small example dataset is in fact the cell-counts for region TMd recorded from the heterozygote group in case study 2. The similarities in geometry can be readily seen with Figure 13A. A large number of divergences were produced by the sampler as demonstrated by the number of pink points compared to the non-divergent transitions in blue. For fixed τ the values can take increase or decrease with smaller or larger κ respectively. The HalfCauchy places an extremely long right tail over κ that frustrates this relationship resulting in a posterior density that is difficult to sample from.
Modified horseshoe
For a Poisson model with parameters modeled on the log scale we consider the Cauchy parameterisation to be too extreme. In light of this we opted for a pragmatic approach, a modification to the original horseshoe by replacing the HalfCauchy distribution with a HalfNormal distribution. The modified horseshoe cuts off the top of the funnels by restricting κ to produce pleasant posterior geometry Figure 13A, B. The modified horseshoe is much easier to sample from, but with the cost of a much more constraining prior over γi.
References
- 1.A new era for functional labeling of neurons: activity-dependent promoters have come of ageFront. Neural Circuits 8
- 2.Genome-wide atlas of gene expression in the adult mouse brainNature 445:168–176
- 3.A mesoscale connectome of the mouse brainNature 508:207–214
- 4.A suite of transgenic driver and reporter mouse lines with enhanced brain-cell-type targeting and functionalityCell 174:465–480
- 5.Hierarchical organization of cortical and thalamic connectivityNature 575:195–202
- 6.Encoding of discriminative fear memory by input-specific ltp in the amygdalaNeuron 95:1129–1146
- 7.Network-level changes in the brain underlie fear memory strengtheLife 12
- 8.Hippocampal engrams generate variable behavioral responses and brain-wide network statesJ. Neurosci 44
- 9.Three-dimensional study of alzheimer’s disease hallmarks using the idisco clearing methodCell Reports 16:1138–1152
- 10.Mapping social behavior-induced brain activation at cellular resolution in the mouseCell Reports 10:292–305
- 11.The mouse brain after foot shock in four dimensions: Temporal dynamics at a single-cell resolutionProc. Natl. Acad. Sci 119
- 12.Bayesian data analysisCRC press
- 13.Statistical rethinking: A Bayesian course with examples in R and StanChapman and Hall/CRC
- 14.Bayesian statistics and modellingNat. Rev. Methods Primers 1:1–26
- 15.Bayesian analysis of phase data in EEG and MEGeLife 12
- 16.A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cellsJ. Neurosci 18:7411–7425
- 17.Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cellsJ. Neurophysiol 79:1017–1044
- 18.Bayesian estimation of synaptic physiology from the spectral responses of neural massesNeuroimage 42:272–284
- 19.Probabilistic inference of short-term synaptic plasticity in neocortical microcircuitsFront. Comput. Neurosci 7
- 20.Bayesian inference of synaptic quantal parameters from correlated vesicle releaseFront. Comput. Neurosci 10
- 21.Model-based inference of synaptic transmissionFront. Synaptic Neurosci 11
- 22.A Bayesian approach for inferring neuronal connectivity from calcium fluorescent imaging dataThe Annals Appl. Stat :1229–1261
- 23.Bayesian mapping of the striatal microcircuit reveals robust asymmetries in the probabilities and distances of connectionsJ. Neurosci 42:1417–1435
- 24.A solution to dependency: using multilevel analysis to accommodate nested dataNat. Neurosci 17:491–496
- 25.When is the hippocampus involved in recognition memoryJ. Neurosci 31:10721–10731
- 26.Neurotoxic lesions of the perirhinal cortex do not mimic the behavioural effects of fornix transection in the ratBehav. Brain Res 80:9–25
- 27.Impaired object recognition with increasing levels of feature ambiguity in rats with perirhinal cortex lesionsBehav. Brain Res 148:79–91
- 28.Contributions of area te2 to rat recognition memoryLearn. & Mem 18:493–501
- 29.A critical role for the nucleus reuniens in long-term, but not short-term associative recognition memory formationJ. Neurosci 38:3208–3217
- 30.Collateral projections from nucleus reuniens of thalamus to hippocampus and medial prefrontal cortex in the rat: a single and double retrograde fluorescent labeling studyBrain Struct. Funct 217:191–209
- 31.The role of the nucleus reuniens of the thalamus in the recognition memory networkMaster’s thesis, School of Physiology, Pharmacology & Neuroscience University of Bristol
- 32.Dual midbrain and forebrain origins of thalamic inhibitory interneuronseLife 10
- 33.Tectal-derived interneurons contribute to phasic and tonic inhibition in the visual thalamusNat. Commun 7
- 34.Retinal input directs the recruitment of inhibitory interneurons into thalamic visual circuitsNeuron 81:1057–1069
- 35.Subcortical visual shell nuclei targeted by iprgcs develop from a sox14+-gabaergic progenitor and require sox14 to regulate daily activity rhythmsNeuron 75:648–662
- 36.Data analysis using regression and multilevel/hierarchical modelsCambridge University Press
- 37.The horseshoe estimator for sparse signalsBiometrika 97:465–480
- 38.Sparsity information and regularization in the horseshoe and other shrinkage priorsElectron. J. Stat 11
- 39.Stan: A probabilistic programming languageJ. Stat. Softw 76
- 40.Identifying the optimal integration time in Hamiltonian Monte CarloarXiv
- 41.The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte CarloJ. Mach. Learn. Res 15:1593–1623
- 42.Some statistical models for limited dependent variables with application to the demand for durable goodsEconom. J. Econom. Soc :829–844
- 43.Generating random correlation matrices based on vines and extended onion methodJ. Multivar. Analysis 100:1989–2001
- 44.Exley warburton nre lesion cell count data
- 45.Moore schultz sox14 expressing neurons
- 46.Inference from iterative simulation using multiple sequencesStat. Sci :457–472
- 47.Rank-normalization, folding, and localization: an improved r for assessing convergence of mcmc (with discussion)Bayesian Analysis 16:667–718
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
Copyright
© 2024, Dimmock et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.