Virtual Brain Inference (VBI), a flexible and integrative toolkit for efficient probabilistic inference on whole-brain models

  1. Abolfazl Ziaeemehr (corresponding author)
  2. Marmaduke Woodman
  3. Lia Domide
  4. Spase Petkoski
  5. Viktor Jirsa (corresponding author)
  6. Meysam Hashemi (corresponding author)

Affiliations:
  1. Aix Marseille University, INSERM, INS, Inst Neurosci System, France
  2. Codemart, Romania
8 figures, 7 tables and 1 additional file

Figures

Figure 1
The workflow of Virtual Brain Inference (VBI).

This probabilistic approach is designed to estimate the posterior distribution of control parameters in virtual brain models from whole-brain recordings. (A) The process begins with constructing a personalized connectome using diffusion tensor imaging and a brain parcellation atlas, such as Desikan-Killiany (Desikan et al., 2006), Automated Anatomical Labeling (Tzourio-Mazoyer et al., 2002), or VEP (Wang et al., 2021). (B) The personalized virtual brain model is then assembled. Neural mass models describing the averaged activity of neural populations, in the generic form ẋ = f(x, θ, I_input), are placed at each brain region and interconnected via the structural connectivity matrix. Initially, the control parameters are randomly drawn from a simple prior distribution. (C) Next, VBI operates as a simulator that uses these samples to generate time series data associated with neuroimaging recordings. (D) We extract a set of summary statistics from the low-dimensional features of the simulations (FC, FCD, PSD) for training. (E) Subsequently, a class of deep neural density estimators is trained on pairs of random parameters and their corresponding data features to learn the joint posterior distribution of the model parameters. (F) Finally, the amortized network allows us to quickly approximate the posterior distribution for new (empirical) data features, enabling us to make probabilistic predictions that are consistent with the observed data.
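The loop sketched in panels (B) to (F) can be written down compactly with a simulation-based inference stack. The following is a minimal, illustrative sketch assuming the open-source `sbi` package and a placeholder simulator, not VBI's own API; the feature dimension and simulation budget are arbitrary.

```python
# Minimal sketch of the Figure 1 workflow, assuming the `sbi` package and a
# placeholder whole-brain simulator (not VBI's actual API).
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

n_regions = 88
# (B) Prior over control parameters, e.g. global coupling G plus one value per region.
prior = BoxUniform(low=torch.zeros(n_regions + 1), high=torch.ones(n_regions + 1))

def simulate_and_featurize(theta):
    """(C)-(D) Run the whole-brain model for one parameter draw and reduce the
    time series to low-dimensional summary statistics (e.g. PSD, FC, FCD).
    Placeholder returning random features so the sketch runs end to end."""
    return torch.randn(20)

theta = prior.sample((1_000,))
x = torch.stack([simulate_and_featurize(t) for t in theta])

# (E) Train a deep neural density estimator (here a masked autoregressive flow)
# on (parameter, feature) pairs to learn the joint posterior distribution.
inference = SNPE(prior=prior, density_estimator="maf")
inference.append_simulations(theta, x)
posterior = inference.build_posterior(inference.train())

# (F) Amortized inference: condition on empirical data features to draw samples.
# samples = posterior.sample((1_000,), x=features_from_empirical_recording)
```

In `sbi`, swapping `density_estimator="maf"` for `"nsf"` selects a neural spline flow instead, which is the comparison made in Figure 2, figure supplement 1B.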

Figure 2 with 3 supplements
Flowchart of the VBI Structure.

This toolkit consists of three main modules: (1) the Simulation module, implementing various whole-brain models, such as Wilson-Cowan (WCo), Jansen-Rit (JR), Stuart-Landau (SL), Epileptor (EPi), Montbrió (MPR), and Wong-Wang (WW), across different numerical computing backends (C++, CuPy, PyTorch, Numba); (2) the Features module, offering an extensive toolbox for extracting low-dimensional data features, such as spectral, temporal, connectivity, statistical, and information-theoretic features; and (3) the Inference module, providing neural density estimators, such as masked autoregressive flows (MAF) and neural spline flows (NSF), to approximate the posterior distribution of the parameters.

Figure 2—figure supplement 1
Benchmarking the computational cost of CPU/GPU and MAF/NSF.

(A) Benchmarking the simulation cost of the whole-brain network of the Montbrió model, with 2.5 M time steps, on an Intel(R) Core(TM) i9-10900 CPU at 2.80 GHz (in red) and an NVIDIA RTX A5000 GPU (in blue). GPUs deliver substantial speedups, up to 100x, over multi-core CPUs. (B) Comparison of the MAF and NSF density estimators, showing that MAF is typically two to four times faster than NSF during training.

Figure 2—figure supplement 2
Evaluating the accuracy of estimating parameter G as a function of the number of simulations used to train the MAF/NSF density estimators, measured by (A) posterior shrinkage and (B) posterior z-score.
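For reference, posterior shrinkage and z-score follow the standard diagnostics s = 1 - σ²_posterior/σ²_prior and z = |μ_posterior - θ_true|/σ_posterior. A minimal NumPy sketch, assuming arrays of posterior samples together with the known prior standard deviation and ground-truth value:

```python
import numpy as np

def posterior_shrinkage(posterior_samples, prior_std):
    """s = 1 - var(posterior) / var(prior); s close to 1 means the data
    strongly constrain the parameter relative to the prior."""
    return 1.0 - np.var(posterior_samples) / prior_std**2

def posterior_zscore(posterior_samples, true_value):
    """z = |mean(posterior) - true| / std(posterior); a small z means the true
    value lies close to the bulk of the posterior."""
    return np.abs(np.mean(posterior_samples) - true_value) / np.std(posterior_samples)

# Example with hypothetical posterior samples for the coupling parameter G.
samples = np.random.normal(0.55, 0.02, size=5_000)
print(posterior_shrinkage(samples, prior_std=np.sqrt(1 / 12)),  # std of a U(0,1) prior
      posterior_zscore(samples, true_value=0.56))
```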
Figure 2—figure supplement 3
Estimation of the global coupling parameter across whole-brain models.

The MAF density estimator was trained using the same features as described for the heterogeneous case in the main text. The simulation budget was limited to a maximum of 300 realizations. The green dashed line indicates an ideal estimation, while the gray dashed line represents a linear regression based on the mean values of the posterior sampling distributions. We observe accurate posterior estimation of the coupling parameter across different models and configurations, except for the Jansen-Rit model, where coupling did not induce a significant change in the intrinsic frequency of regional activity. Note that the Montbrió model shows more diffuse estimates than the others, owing to noise-driven switching dynamics.

Figure 3 with 3 supplements
Bayesian inference on heterogeneous control parameters in the whole-brain network of Wilson-Cowan model.

The set of inferred parameters is θ = {G, P_i} ∈ ℝ^89, with the global scaling parameter G and the average external input current to excitatory populations P_i per region, given i ∈ {1, 2, ..., N_n = 88} parcelled regions. Summary statistics of the power spectral density (PSD) were used for training the MAF density estimators, with a budget of 260 k simulations. (A) and (B) illustrate the observed and predicted neural activities, respectively. (C) and (D) show the observed and predicted PSDs as the data features. (E) and (F) display the posterior distributions of P_i per region and the global coupling G, respectively. The ground truth and prior are represented by a vertical green line and a blue distribution, respectively. (G) shows the inference evaluation using posterior shrinkage and z-score.
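As an illustration of PSD-based summary statistics of this kind, one could compute a Welch periodogram per region and keep a few band-limited descriptors. This is a hedged sketch; the exact feature set used by VBI may differ, and the sampling rate and band edge below are placeholders.

```python
import numpy as np
from scipy.signal import welch

def psd_features(ts, fs, fmax=100.0):
    """ts: (n_regions, n_samples) simulated regional activity; fs: sampling rate.
    Returns per-region PSD descriptors (peak frequency, total power,
    mean frequency) concatenated into one feature vector."""
    freqs, pxx = welch(ts, fs=fs, nperseg=min(ts.shape[1], 4 * int(fs)))
    keep = freqs <= fmax
    freqs, pxx = freqs[keep], pxx[:, keep]
    peak_freq = freqs[np.argmax(pxx, axis=1)]            # dominant rhythm per region
    total_power = np.trapz(pxx, freqs, axis=1)           # area under the PSD
    mean_freq = np.trapz(freqs * pxx, freqs, axis=1) / total_power
    return np.concatenate([peak_freq, total_power, mean_freq])

# e.g. features = psd_features(simulated_activity, fs=1000.0)
```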

Figure 3—figure supplement 1
Inference on heterogeneous control parameters across different configurations.

The set of unknown control parameters is θ = {G, P_i} ∈ ℝ^89. We observe accurate posterior estimation across different scenarios obtained by varying the external input current map across brain regions. Features derived from the power spectral density were used for training, with a simulation budget of 260 k. The regions are color-coded by their current values (blue: low, green: medium, red: high external input current), and the size of each region reflects the shrinkage of the posterior distribution for the corresponding inferred parameter. Top: axial view. Bottom: sagittal view. The three configurations (from left to right) correspond to randomly assigned heterogeneous P_i values, P_i increased for the five strongest nodes, and P_i increased for the five weakest nodes.

Figure 3—figure supplement 2
Inference using the MAF estimator, but ignoring the spatial information in the data features.

Here, we observe that the MAF density estimator leads to a multimodal posterior distribution for low values of P_i when trained with a reduced set of data features, suggesting increased uncertainty or potential non-identifiability under limited information.

Figure 3—figure supplement 3
Inference using the NSF estimator, but ignoring the spatial information in the data features.

Here, we observe more accurate estimation; however, NSF required approximately 6 hr of training compared to 20 min for MAF.

Figure 4 with 1 supplement
Bayesian inference on heterogeneous control parameters in the whole-brain network of the Jansen-Rit model.

The set of inferred parameters is θ = {G, C_i} ∈ ℝ^89, with the global scaling parameter G and the average number of synapses between neural populations C_i per region, given i ∈ {1, 2, ..., N_n = 88} parcelled regions. Summary statistics of the power spectral density (PSD) were used for training, with a budget of 50 k simulations. (A) and (B) illustrate the observed and predicted neural activities, respectively. (C) and (D) show the observed and predicted data features (PSD). (E) and (F) display the posterior distributions of C_i per region and the global coupling G, respectively. The ground truth and prior are represented by vertical green lines and a blue distribution, respectively. (G) shows the inference evaluation using the shrinkage and z-score of the estimated posterior distributions.

Figure 4—figure supplement 1
Inference on heterogeneous control parameters while excluding total power from the data features.

The set of inferred parameters is θ = {G, C_i} ∈ ℝ^89, consisting of the global scaling parameter G and the average number of synapses between neural populations C_i per region, with i ∈ {1, 2, ..., N_n = 88} parcelled regions. For panels A to C, only the area under the power spectral density (PSD) curve in the [0, 30] Hz band was used, while for panels D to F, additional statistical features of the PSD were included in the training, with a total budget of 50 k simulations. (A) and (D) illustrate the observed and predicted EEG signals. (B) and (E) show the observed and predicted PSDs. (C) and (F) display the posterior distribution of C_i for each region. The true values are indicated by vertical green lines, and the prior distribution is shown in blue.

Figure 5
Bayesian inference on the global scaling parameter G and the averaged velocity V of signal transmission using the whole-brain network model of Stuart-Landau oscillators.

The set of estimated parameters is θ = {G, V} ∈ ℝ^2, and summary statistics of the PSD were used for training, with a budget of 2 k simulations. (A) illustrates exemplary observed and predicted neural activities (in green and red, respectively). (B) shows the observed and predicted PSDs (in green and red, respectively). (C) and (D) display the posterior distributions of the averaged velocity V and global coupling G, respectively. The true values and prior are shown as vertical green lines and a blue distribution, respectively. (E) shows the joint posterior distribution, indicating a high correlation between posterior samples. (F) illustrates the sensitivity analysis based on the eigenvalues of the posterior distribution.
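The sensitivity analysis in panel (F) can be reproduced in spirit by an eigendecomposition of the posterior covariance: directions with small eigenvalues are tightly constrained (stiff), while those with large eigenvalues are poorly constrained (sloppy). A minimal sketch, with synthetic correlated draws of (G, V) standing in for the actual posterior samples:

```python
import numpy as np

def posterior_sensitivity(samples):
    """samples: (n_draws, n_params) joint posterior draws.
    Returns eigenvalues and eigenvectors of the posterior covariance; small
    eigenvalues mark stiff (well-identified) directions, large ones sloppy
    (poorly identified) directions."""
    cov = np.cov(samples, rowvar=False)
    return np.linalg.eigh(cov)

# Synthetic, highly correlated (G, V) draws mimicking panel (E).
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean=[350.0, 6.0],
                                  cov=[[100.0, 4.0], [4.0, 0.25]],
                                  size=5_000)
eigvals, eigvecs = posterior_sensitivity(samples)
```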

Figure 6 with 3 supplements
Bayesian inference on the spatial map of epileptogenicity across brain regions in the VEP model.

The set of inferred parameters is θ = {G, η_i} ∈ ℝ^89, comprising the global scaling parameter and the spatial map of epileptogenicity, with i ∈ {1, 2, ..., N_n = 88} parcelled regions. (A) The observed seizure envelope generated by the Epileptor model, given two regions as epileptogenic zones (in red) and three regions as propagation zones (in yellow), while the rest are healthy (in green). (B) The predicted seizure envelope, obtained by training the MAF model on a dataset containing 10 k simulations, using only the total power and seizure onset per region as the data features. (C) and (D) show the observed and predicted data features, respectively. (E) and (F) show the posterior distributions of the heterogeneous control parameters η_i and the global coupling parameter G, respectively. (G) The posterior z-scores versus posterior shrinkages for the estimated parameters.
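A hedged sketch of the two per-region data features named here, the total power of the seizure envelope and the onset time, assuming the envelope has already been extracted; the onset threshold and the convention for non-seizing regions are illustrative choices, not taken from the paper.

```python
import numpy as np

def envelope_features(env, dt, onset_frac=0.1):
    """env: (n_regions, n_samples) seizure envelope per region; dt: time step.
    Returns total power and seizure-onset time per region. Regions whose
    envelope never crosses the threshold are assigned the recording length as
    'onset', i.e. no seizure (an illustrative convention)."""
    total_power = np.sum(env**2, axis=1) * dt
    thresh = onset_frac * env.max()
    onset = np.full(env.shape[0], env.shape[1] * dt)
    for i, e in enumerate(env):
        crossings = np.flatnonzero(e > thresh)
        if crossings.size:
            onset[i] = crossings[0] * dt
    return np.concatenate([total_power, onset])
```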

Figure 6—figure supplement 1
Inference with slow time-scale separation, τ = 10 ms.

(A) The observed seizure envelope generated by the 2D Epileptor model, given two regions as epileptogenic zones (in red) and three regions as propagation zones (in yellow), while the rest are healthy (in green). (B) The predicted seizure envelope, obtained by training the MAF density estimator on a dataset containing 10 k simulations, using only the total power per region as the data feature. (C) and (D) show the observed and predicted data features, respectively. (E) and (F) show the posterior distributions of the heterogeneous η_i and the global coupling parameter G, respectively. (G) The posterior z-scores versus posterior shrinkages for the estimated parameters.

Figure 6—figure supplement 2
Evaluating the performance of estimation under different levels of additive noise.

(A) Simulated time series under different levels of additive noise. (B) and (C) posterior shrinkage and z-score, respectively, as a function of noise level σ. (D) Ground truth (in green), prior (in blue), and posterior distributions (in red) for the global coupling parameter. We observe that at higher noise levels (e.g. σ = 0.3 and σ = 0.6), the ground truth is not captured within the posterior, leading to increased z-scores compared to the noise-free case (σ = 0).

Figure 6—figure supplement 3
Evaluating the performance of estimation under different levels of dynamical noise.

(A) Simulated time series generated under different levels of dynamical noise. (B) and (C) posterior shrinkage and z-score, respectively, across different scenarios, demonstrating the accuracy and reliability of estimating the dynamical noise parameter. (D) Ground truth (in green), prior (in blue), and posterior distributions (in red) for different realizations of the dynamical noise parameter σ in the system. We observe robust estimation even at high levels of dynamical noise, though for σ ≥ 1.6 the posterior sampling fails (shaded area).

Figure 7 with 7 supplements
Bayesian inference on heterogeneous control parameters in the whole-brain dynamics using the Montbrió model.

The set of inferred parameters is θ = {G, η_i} ∈ ℝ^89, comprising the global scaling parameter and the excitability per region, with i ∈ {1, 2, ..., N_n = 88} parcelled regions. VBI provides accurate and reliable posterior estimation using both spatio-temporal and functional data features for training, with a budget of 500 k simulations. (A) and (B) illustrate the observed and predicted BOLD signals, respectively. (C) and (D) show the observed (upper triangle) and predicted (lower triangle) data features (FC and FCD), respectively. (E) and (F) display the posterior distributions of the excitability parameters η_i per region and the global coupling G, respectively. The true values and prior are shown as vertical green lines and a blue distribution, respectively. (G) shows the inference evaluation by the shrinkage and z-score of the posterior distributions.
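For reference, the two functional data features can be computed from the BOLD time series as below: FC is the region-by-region Pearson correlation, and FCD correlates the upper triangles of FC matrices computed in sliding windows. This is a minimal NumPy sketch; the window length and step are illustrative.

```python
import numpy as np

def fc(bold):
    """bold: (n_regions, n_volumes). Functional connectivity = Pearson
    correlation between regional BOLD time series."""
    return np.corrcoef(bold)

def fcd(bold, win_len=30, step=5):
    """Functional connectivity dynamics: correlation between the upper
    triangles of FC matrices computed in sliding windows."""
    n_regions, n_vols = bold.shape
    iu = np.triu_indices(n_regions, k=1)
    windows = [fc(bold[:, t:t + win_len])[iu]
               for t in range(0, n_vols - win_len + 1, step)]
    return np.corrcoef(np.array(windows))

# Scalar summaries often derived from these matrices include, for example,
# the sum of FC entries and the variance of the off-diagonal FCD entries.
```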

Figure 7—figure supplement 1
Inference across different configurations.

The set of inferred parameters is θ = {G, η_i} ∈ ℝ^89, comprising the global scaling parameter and the excitability per region, with i ∈ {1, 2, ..., N_n = 88} parcelled regions.

Figure 7—figure supplement 2
Scatter plot of summary statistics used for training.

Scatter plot of summary statistics from the FC/FCD matrices in the whole-brain model of Montbrió.

Figure 7—figure supplement 3
Inference on heterogeneous control parameters using only FC/FCD matrices.

Here, G is the global scaling parameter and η_i represents the excitability of each brain region, where i ∈ {1, 2, ..., N_n = 88} indexes the parcelled regions. Relying only on functional data features (FC/FCD) for training proves insufficient to accurately estimate the posterior distribution of the heterogeneous excitabilities, even with a budget of 500 k simulations. In (A), the posterior estimates are shown in red, the true values of η_i in green, and the prior distribution in blue. Panels (B) and (C) show the observed and predicted FC/FCD matrices, respectively, as the data features. Generating the training data took approximately 24 hr using 10 GPUs concurrently; training lasted around 8 hr, while sampling required just a few seconds.

Figure 7—figure supplement 4
Estimation of the coupling parameter using only functional data features via VBI simulator.

(A) The uniform prior distribution of G (in blue), the estimated posterior (in red), and the true value (in green). (B) and (C) show the observed and predicted FC/FCD matrices, respectively.

Figure 7—figure supplement 5
Estimation of the coupling parameter using only functional data features via TVB simulator.

(A) The uniform prior distribution of G (in blue), the estimated posterior (in red), and the true value (in green). (B) and (C) show the observed and predicted FC/FCD matrices, respectively.

Figure 7—figure supplement 6
Evaluating the performance of estimation under different levels of additive noise.

(A) and (B) show the simulated time series and corresponding FCD matrices, respectively, under different levels of additive noise. (C) and (D) show the posterior shrinkages and z-scores, respectively, for different noise values. (E) Ground truth (in green), prior (in blue), and posterior distribution (in red) for the estimation of the coupling parameter G under different levels of additive noise in the BOLD signals, indicating robust inference. However, for very high noise levels (σ ≥ 0.45), the posterior sampling fails (shaded area).

Figure 7—figure supplement 7
Evaluating the performance of estimation under different levels of dynamical noise.

(A) and (B) show the simulated time series and corresponding FCD matrices, respectively, under different levels of dynamical noise. (C) and (D) show the posterior shrinkages and z-scores, respectively, for different noise values. (E) Ground truth (in green), prior (in blue), and posterior distributions (in red) for different realizations of the dynamical noise parameter σ in the system. We observe robust estimation even at high levels of dynamical noise.

Figure 8 with 3 supplements
Bayesian inference on the parametric mean-field model of Wong-Wang (also known as the pDMF model), with linear coefficients (a_w, b_w, c_w, a_I, b_I, c_I, a_σ, b_σ, c_σ) ∈ ℝ^9 reparameterizing the recurrent connection strength w_i, external input current I_i, and noise amplitude σ_i for each region.

Summary statistics of spatio-temporal and functional data features were used for training, with a budget of 200 k simulations. (A) The diagonal panels display the ground-truth values (in green), the uniform prior (in blue), and the estimated posterior distributions (in red). The panels above the diagonal illustrate the joint posterior distributions between parameters, along with their correlations (ρ, in the upper left corners) and the ground-truth values (green stars). High-probability areas are color-coded in yellow, while low-probability areas are shown in black. (B) The observed and predicted BOLD time series (in green and red, respectively). (C) The observed and predicted data features, such as the FC/FCD matrices. (D) The inference evaluation by calculating the shrinkage and z-score of the estimated posterior distributions.
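The linear reparameterization reduces the 3·N_n regional parameters to the nine coefficients listed above. Below is a minimal sketch of such a mapping, assuming (as in related parametric DMF work) that each regional parameter is a linear combination of two regional maps, for example a myelin estimate m_i and a functional gradient g_i, plus an intercept; the specific maps are an assumption of this sketch, not taken from the figure.

```python
import numpy as np

def pdmf_parameters(coeffs, m, g):
    """coeffs: (a_w, b_w, c_w, a_I, b_I, c_I, a_s, b_s, c_s).
    m, g: two regional maps of length n_regions (e.g. myelin, FC gradient);
    both assumed here for illustration.
    Returns per-region recurrent strength w_i, external input I_i, and noise
    amplitude sigma_i as linear combinations of the two maps."""
    a_w, b_w, c_w, a_I, b_I, c_I, a_s, b_s, c_s = coeffs
    w = a_w * m + b_w * g + c_w
    I_ext = a_I * m + b_I * g + c_I
    sigma = a_s * m + b_s * g + c_s
    return w, I_ext, sigma
```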

Figure 8—figure supplement 1
The distribution of observed and predicted data features in the pDMF model.

(A) The functional connectivity (FC), and (B) the functional connectivity dynamics (FCD).

Figure 8—figure supplement 2
The non-identifiability in estimating G and J in the pDMF model.

A total of 16 k simulations and functional data features were used for training. The upper row illustrates the FCD variance and the sum of FC matrices plotted against the parameters G and J, respectively, for three selected observations marked by red circles. The lower panels display the parameter estimations corresponding to the selected observations. Due to structural non-identifiability in the product of G and J, the posterior distributions appear diffuse, showing a nonlinear dependency (banana shape) in the joint posterior plots.

Figure 8—figure supplement 3
Scatter plot of summary statistics used for training.

Scatter plot of summary statistics from the FC/FCD matrices in the pDMF model.

Tables

Table 1
Parameter descriptions for capturing whole-brain dynamics using the Wilson-Cowan neural mass model.
Parameter | Description | Value | Prior
c_ee | Excitatory to excitatory synaptic strength | 16.0 |
c_ei | Inhibitory to excitatory synaptic strength | 12.0 |
c_ie | Excitatory to inhibitory synaptic strength | 15.0 |
c_ii | Inhibitory to inhibitory synaptic strength | 3.0 |
τ_e | Time constant of excitatory population | 8.0 |
τ_i | Time constant of inhibitory population | 8.0 |
a_e | Sigmoid slope for excitatory population | 1.3 |
a_i | Sigmoid slope for inhibitory population | 2.0 |
b_e | Sigmoid threshold for excitatory population | 4.0 |
b_i | Sigmoid threshold for inhibitory population | 3.7 |
c_e | Maximum output of sigmoid for excitatory population | 1.0 |
c_i | Maximum output of sigmoid for inhibitory population | 1.0 |
θ_e | Firing threshold for excitatory population | 0.0 |
θ_i | Firing threshold for inhibitory population | 0.0 |
r_e | Refractoriness of excitatory population | 1.0 |
r_i | Refractoriness of inhibitory population | 1.0 |
k_e | Scaling constant for excitatory output | 0.994 |
k_i | Scaling constant for inhibitory output | 0.999 |
α_e | Gain of excitatory population | 1.0 |
α_i | Gain of inhibitory population | 1.0 |
P | External input to excitatory population | 0.0 | U(0, 3)
Q | External input to inhibitory population | 0.0 | U(0, 3)
g_e | Global coupling strength (excitatory) | 0.0 | U(0, 1)
g_i | Global coupling strength (inhibitory) | 0.0 | U(0, 1)
σ | Standard deviation of Gaussian noise | 0.005 |
Table 2
Parameter descriptions for capturing whole-brain dynamics using the Jansen-Rit neural mass model.

EP: excitatory populations, IP: inhibitory populations, PSP: postsynaptic potential, PSPA: postsynaptic potential amplitude.

Parameter | Description | Value | Prior
A | Excitatory PSPA | 3.25 mV |
B | Inhibitory PSPA | 22 mV |
1/a | Time constant of excitatory PSP | a = 100 s^-1 |
1/b | Time constant of inhibitory PSP | b = 50 s^-1 |
C1, C2 | Average numbers of synapses between EP | 1C, 0.8C |
C3, C4 | Average numbers of synapses between IP | 0.25C |
v_max | Maximum firing rate | 5 Hz |
v_0 | Potential at half of maximum firing rate | 6 mV |
r | Slope of sigmoid function at v_0 | 0.56 mV^-1 |
C | Average numbers of synapses between neural populations | 135 | U(100, 500)
G | Scaling the strength of network connections | 1.5 | U(0, 5)
Table 3
Parameter descriptions for capturing whole-brain dynamics using the Stuart-Landau oscillator.

Parameter | Description | Value | Prior
a | Bifurcation parameter | -5 |
ω_i | Natural angular frequency | 2π·40 rad/s |
σ | Noise factor | 10^-4 |
v | Average conduction velocity | 6.0 m/s | U(1, 30)
G | Global coupling parameter | 350 | U(0, 400)
Table 4
Parameter descriptions for capturing whole-brain dynamics using the 2D Epileptor neural mass model.

Parameter | Description | Value | Prior
I | Input electric current | 3.1 |
τ | System time constant | 90 ms |
η_i | Spatial map of epileptogenicity | -3.65 | U(-5, -1)
G | Global scaling factor on network connections | 1.0 | U(0, 2)
Table 5
Parameter descriptions for capturing whole-brain dynamics using the Montbrió model.

Parameter | Description | Nominal value | Prior
τ | Characteristic time constant | 1 ms |
J | Synaptic weight | 14.5 ms^-1 |
Δ | Spread of the heterogeneous noise distribution | 0.7 ms^-1 |
I_stim(t) | Input current representing stimulation | 0.0 |
σ | Gaussian noise variance | 0.037 |
η | Excitability | -4.6 | U(-6, -3.5)
G | Scaling the strength of network connections | 0.56 | U(0, 1)
Table 6
Parameter descriptions for capturing whole-brain dynamics using the Wong-Wang model.
Parameter | Description | Value | Prior
a | Max feeding rate of H(x) | 270 n/C |
b | Half saturation of H(x) | 108 Hz |
d | Controls the steepness of the curve of H(x) | 0.154 s |
γ | Kinetic parameter | 0.641/1000 |
τ_s | Synaptic time constant | 100 ms |
J | Synaptic coupling | 0.2609 nA |
w | Local excitatory recurrence | 0.6 | U(0, 1)
I | Overall effective external input | 0.3 nA | U(0, 0.5)
G | Scaling the strength of network connections | 6.28 | U(1, 10)
σ | Noise amplitude | 0.005 | U(0.0005, 0.01)
Table 7
Parameter descriptions for the Balloon-Windkessel model used to map neural activity to the BOLD signals detected in fMRI; a code sketch of the corresponding equations is given after the table.
Parameter | Description | Value
τ_s | Rate constant of vasodilatory signal decay in seconds | 1.5
τ_f | Time of flow-dependent elimination in seconds | 4.5
α | Grubb's vessel stiffness exponent | 0.2
τ_0 | Hemodynamic transit time in seconds | 1.0
ϵ | Efficacy of synaptic activity to induce signal | 0.1
r_0 | Slope of intravascular relaxation rate in Hertz | 25.0
ϑ_0 | Frequency offset at outer surface of magnetized vessels | 40.3
ε | Ratio of intra- and extravascular BOLD signal at rest | 1.43
V_0 | Resting blood volume fraction | 0.02
E_0 | Resting oxygen extraction fraction | 0.8
TE | Echo time (1.5 T scanner) | 0.04
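The sketch below integrates the Balloon-Windkessel equations with a simple Euler scheme, using the constants from Table 7 as defaults; it follows the common Friston/Stephan formulation, and the exact implementation in VBI may differ.

```python
import numpy as np

def balloon_windkessel(z, dt, tau_s=1.5, tau_f=4.5, alpha=0.2, tau_0=1.0,
                       eps=0.1, r_0=25.0, theta_0=40.3, rho=1.43,
                       V_0=0.02, E_0=0.8, TE=0.04):
    """Map neural activity z (n_regions, n_steps) to BOLD with an
    Euler-integrated Balloon-Windkessel model. eps is the synaptic efficacy
    (Table 7: ϵ), rho the intra/extravascular ratio at rest (Table 7: ε)."""
    n, T = z.shape
    s = np.zeros(n)            # vasodilatory signal
    f = np.ones(n)             # blood inflow
    v = np.ones(n)             # blood volume
    q = np.ones(n)             # deoxyhemoglobin content
    bold = np.zeros((n, T))
    # Revised BOLD coefficients (Stephan et al., 2007 formulation).
    k1 = 4.3 * theta_0 * E_0 * TE
    k2 = rho * r_0 * E_0 * TE
    k3 = 1.0 - rho
    for t in range(T):
        ds = eps * z[:, t] - s / tau_s - (f - 1.0) / tau_f
        df = s
        dv = (f - v**(1.0 / alpha)) / tau_0
        dq = (f * (1.0 - (1.0 - E_0)**(1.0 / f)) / E_0
              - v**(1.0 / alpha) * q / v) / tau_0
        s += dt * ds
        f += dt * df
        v += dt * dv
        q += dt * dq
        bold[:, t] = V_0 * (k1 * (1.0 - q) + k2 * (1.0 - q / v) + k3 * (1.0 - v))
    return bold
```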


Cite this article
Ziaeemehr A, Woodman M, Domide L, Petkoski S, Jirsa V, Hashemi M (2025) Virtual Brain Inference (VBI), a flexible and integrative toolkit for efficient probabilistic inference on whole-brain models. eLife 14:RP106194. https://doi.org/10.7554/eLife.106194.4