Virtual Brain Inference (VBI), a flexible and integrative toolkit for efficient probabilistic inference on whole-brain models
Figures
The workflow of Virtual Brain Inference (VBI).
This probabilistic approach is designed to estimate the posterior distribution of control parameters in virtual brain models from whole-brain recordings. (A) The process begins with constructing a personalized connectome using diffusion tensor imaging and a brain parcellation atlas, such as Desikan-Killiany (Desikan et al., 2006), Automated Anatomical Labeling (Tzourio-Mazoyer et al., 2002), or VEP (Wang et al., 2021). (B) The personalized virtual brain model is then assembled. Neural mass models describing the averaged activity of neural populations, in the generic form of , are placed to each brain region and interconnected via the structural connectivity matrix. Initially, the control parameters are randomly drawn from a simple prior distribution. (C) Next, the VBI operates as a simulator that uses these samples to generate time series data associated with neuroimaging recordings. (D) We extract a set of summary statistics from the low-dimensional features of the simulations (FC, FCD, PSD) for training. (E) Subsequently, a class of deep neural density estimators is trained on pairs of random parameters and their corresponding data features to learn the joint posterior distribution of the model parameters. (F) Finally, the amortized network allows us to quickly approximate the posterior distribution for new (empirical) data features, enabling us to make probabilistic predictions that are consistent with the observed data.
Flowchart of the VBI Structure.
This toolkit consists of three main modules: (1) The Simulation module, implementing various whole-brain models, such as Wilson-Cowan (WCo), Jansen-Rit (JR), Stuart-Landau (SL), Epileptor (EPi), Montbrió (MPR), and Wong-Wang (WW), across different numerical computing libraries (C++, Cupy, PyTorch, Numba). (2) The Features module, offering an extensive toolbox for extracting low-dimensional data features, such as spectral, temporal, connectivity, statistical, and information theory features. (3) The Inference module, providing neural density estimators (such as MAF and NSF) to approximate the posterior of parameters.
Benchmarking the computational cost of CPU/GPU and MAF/NSF.
(A) Benchmarking the simulation cost of the whole-brain network of the Montbrió model, with 2.5 M time steps on Intel(R) Core(TM) i9-10900 CPU 2.80 GHz (in red) and NVIDIA RTX A5000 GPU (in blue). GPUs deliver substantial speedups up to 100x over multi-core CPUs. (B) Comparing MAF and NSF density estimators, showing that MAF is typically two to four times faster than NSF during the training process.
Evaluating the accuracy of estimation (parameter ) with respect to the number of simulations used for training the MAF/NSF density estimator by (A) posterior shrinkage and (B) posterior z-score, versus the number of simulations.
Estimation of the global coupling parameter across whole-brain models.
The MAF density estimator was trained using the same features as described for the heterogeneous case in the main text. The simulation budget was limited to a maximum of 300 realizations. The green dashed line indicates an ideal estimation, while the gray dashed line represents a linear regression based on the mean values of the posterior sampling distributions. We can observe an accurate posterior estimation for the coupling parameter across different models and configurations, except for the Jansen-Rit model, where coupling did not induce a significant change in the intrinsic frequency of regional activity. Note that the Montbrió model shows more diffuse estimations compared to the others, due to switching dynamics driven by noise.
Bayesian inference on heterogeneous control parameters in the whole-brain network of Wilson-Cowan model.
The set of inferred parameters is , with the global scaling parameter and average external input current to excitatory populations per region, given parcelled regions. Summary statistics of power spectrum density (PSD) were used for training of MAF density estimators, with a budget of 260 k simulations. (A) and (B) illustrate the observed and predicted neural activities, respectively. (C) and (D) show the observed and predicted PSDs as the data features. (E) and (F) display the posterior distribution of per region, and global coupling , respectively. The ground truth and prior are represented by a vertical green line and a blue distribution, respectively. (G) shows the inference evaluation using posterior shrinkage and z-score.
Inference on heterogeneous control parameters across different configurations.
The set of unknown control parameters is . We observe accurate posterior estimation across different scenarios by varying the external input current map across brain regions. Features derived from the power spectrum density were used for training, with a simulation budget of 260 k. The regions are color-coded based on their current values: blue indicates low, green medium, and red high external input current and the size of the regions reflects the shrinkage of the posterior distribution for the corresponding inferred parameters. Top: axial view. Bottom: sagittal view. The three configurations (from left to right) correspond to: randomly assigned heterogeneous values, increased for the five strongest nodes, and increased for the five weakest nodes.
Inference using the MAF estimator, but ignoring the spatial information in the data features.
Here, we observe that the MAF density estimator leads to a multimodal posterior distribution for low values of , when trained with a reduced set of data features, suggesting increased uncertainty or potential non-identifiability under limited information.
Inference using the NSF estimator, but ignoring the spatial information in the data features.
Here, we observe more accurate estimation; however, NSF required approximately 6 hr of training compared to 20 min for MAF.
Bayesian inference on heterogeneous control parameters in the whole-brain network of the Jansen-Rit model.
The set of inferred parameters is , with the global scaling parameter and average numbers of synapse between neural populations per region, given parcelled regions. Summary statistics of power spectrum density (PSD) were used for training, with a budget of 50 k simulations. (A) and (B) illustrate the observed and predicted neural activities, respectively. (C) and (D) show the observed and predicted data features, such as PSD. (E) and (F) display the posterior distribution of per region, and global coupling , respectively. The ground truth and prior are represented by vertical green lines and a blue distribution, respectively. (G) shows the inference evaluation using the shrinkage and z-score of the estimated posterior distributions.
Inference on heterogeneous control parameters while excluding total power from the data features.
The set of inferred parameters is, consisting of the global scaling parameter and the average number of synapses between neural populations per region, with parcelled regions. For panels A to C, only the area under the curve of the power spectral density (PSD) in the [0,30] Hz band was used, while for panels D to F, additional statistical features of the PSD were included in the training, with a total budget of 50 k simulations. (A) and (D) illustrate the observed and predicted EEG signals. (B) and (E) show the observed and predicted PSD signals. (C) and (F) display the posterior distribution of for each region. The true values are indicated by vertical green lines, and the prior distribution is shown in blue.
Bayesian inference on global scaling parameter and the averaged velocity of signal transmission using the whole-brain network model of Stuart-Landau oscillators.
The set of estimated parameters is , and the summary statistics of PSD signals with a budget of 2 k simulations were used for training. (A) illustrates exemplary observed and predicted neural activities (in green and red, respectively). (B) shows the observed and predicted PSD signals (in green and red, respectively). (C) and (D) display the posterior distribution of averaged velocity and global coupling , respectively. The true values and prior are shown as vertical green lines and a blue distribution, respectively. (E) shows the joint posterior distribution indicating a high correlation between posterior samples. (F) illustrates the sensitivity analysis based on the eigenvalues of the posterior distribution.
Bayesian inference on the spatial map of epileptogenicity across brain regions in the VEP model.
The set of inferred parameters is , as the global scaling parameter and spatial map of epileptogenicity with parcelled regions. (A) The observed seizure envelope generated by the Epileptor model, given two regions as epileptogenic zones (in red) and three regions as propagation zones (in yellow), while the rest are healthy (in green). (B) The predicted seizure envelope, by training MAF model on a dataset containing 10 k simulations, using only the total power and seizure onset per region as the data features. (C) and (D) show the observed and predicted data features, respectively. (E) and (F) show the posterior distributions of heterogeneous control parameters , and global coupling parameter , respectively. (G) The posterior z-scores versus posterior shrinkages for estimated parameters.
Inference with slow time scale separation, .
(A) The observed seizure envelope generated by the 2D Epileptor model given two regions as epileptogenic zones (in red) and three regions as propagation zones (in yellow), while the rest are healthy (in green). (B) The predicted seizure envelope, by training MAF density estimator on a dataset containing 10 k simulations, using only the total power per region as the data features. (C) and (D) show the observed and predicted data features, respectively. (E) and (F) show the posterior distribution of heterogeneous , and global coupling parameter , respectively. (G) The posterior z-scores versus posterior shrinkages for estimated parameters.
Evaluating the performance of estimation under different levels of additive noise.
(A) Simulated time series under different levels of additive noise. (B) and (C) posterior shrinkage and z-score, respectively, as a function of noise level . (D) Ground truth (in green), prior (in blue), and posterior distributions (in red) for global coupling parameter. We observe that at higher noise levels (e.g. and ), the ground truth is not captured within the posterior, leading to increased z-scores compared to the noise-free case .
Evaluating the performance of estimation under different levels of dynamical noise.
(A) Simulated time series generated under different levels of dynamical noise. (B) and (C) posterior shrinkage and z-score, respectively, across different scenarios, demonstrating the accuracy and reliability of estimating the dynamical noise parameter. (D) Ground truth (in green), prior (in blue), and posterior distributions (in red), for different realizations of the dynamical noise parameter in the system. We observe robust estimation even at high levels of dynamical noise, though for the posterior sampling fails (shaded area).
Bayesian inference on heterogeneous control parameters in the whole-brain dynamics using the Montbrió model.
The set of inferred parameters is , as the global scaling parameter and excitability per region, with parcelled regions. VBI provides accurate and reliable posterior estimation using both spatio-temporal and functional data features for training, with a budget of 500 k simulations. (A) and (B) illustrate the observed and predicted BOLD signals, respectively. (C) and (D) show the observed (upper triangular) and predicted (lower triangular) data features (FC and FCD), respectively. (E) and (F) display the posterior distribution of excitability parameters per region, and global coupling , respectively. The true values and prior are shown as vertical green lines and a blue distribution, respectively. (G) shows the inference evaluation by the shrinkage and z-score of the posterior distributions.
Inference across different configurations.
The set of inferred parameters is , as the global scaling parameter and excitability per region, with parcelled regions.
Scatter plot of summary statistics used for training.
Scatter plot of summary statistics from the FC/FCD matrices in the whole-brain model of Montbrió.
Inference on heterogeneous control parameters using only FC/FCD matrices.
Here, is the global scaling parameter, and represents the excitability of each brain region, where corresponds to the parcelled regions. Relying on only functional data features (FC/FCD) for training proves insufficient to accurately estimate the posterior distribution of heterogeneous excitabilities, even with a budget of 500 k simulations. In (A), the posterior estimates are shown in red, the true values of in green, and the prior distribution in blue. Panels (B) and (C) show the observed and predicted FC/FCD matrices, respectively, as the data features. The simulation to generate training data took approximately 24 hr, utilizing 10 GPUs concurrently. The training process lasted around 8 hr, while sampling required just a few seconds.
Estimation of the coupling parameter using only functional data features via VBI simulator.
(A) The uniform prior distribution of in blue, the estimated posterior in red and the true value is shown in green. (B) and (C) the observed and predicted FC/FCD matrices, respectively.
Estimation of the coupling parameter using only functional data features via TVB simulator.
(A) The uniform prior distribution of in blue, the estimated posterior in red and the true value is shown in green. (B) and (C) show the observed and predicted FC/FCD matrices, respectively.
Evaluating the performance of estimation under different levels of additive noise.
(A) and (B) simulated time series and corresponding FCD matrices, respectively, under different levels of additive noise. (C) and (D) posterior shrinkages and z-scores, respectively, for different noise values. (E) Ground truth (in green), prior (in blue), and posterior distribution (in red), for estimation of coupling parameter , given different values of additive noise to the BOLD signals, indicating robust inference. However, for very high values of noise () the posterior sampling fails (shaded area).
Evaluating the performance of estimation under different levels of dynamical noise.
(A) and (B) simulated time series and corresponding FCD matrices, respectively, under different levels of dynamical noise. (C) and (D) posterior shrinkages and z-scores, respectively, for different noise values. (E) Ground truth (in green), prior (in blue), and posterior distributions (in red), for different realizations of the dynamical noise parameter in the system. We observe robust estimation even at high levels of dynamical noise.
Bayesian inference on the parametric mean-field model of Wong-Wang (also known as pDMF model), with linear coefficients , reparameterizing the recurrent connection strength , external input current , and noise amplitude for each region.
Summary statistics of spatio-temporal and functional data features were used for training, with a budget of 200 k simulations. (A) The diagonal panels display the ground-true values (in green), the uniform prior (in blue), and the estimated posterior distributions (in red). The upper diagonal panels illustrate the joint posterior distributions between parameters, along with their correlation (, in the upper left corners), and ground-truth values (green stars). High-probability areas are color-coded in yellow, while low-probability areas are shown in black. (B) The observed and predicted BOLD time series (in green and red, respectively). (C) The observed and predicted data features, such as FC/FCD matrices. (D) The inference evaluation by calculating the shrinkage and z-score of the estimated posterior distributions.
The distribution of observed and predicted data features in the pDMF model.
(A) The functional connectivity (FC), and (B) the functional connectivity dynamics (FCD).
The non-identifiability in estimating and in the pDMF model.
A total of 16 k simulations and functional data features were used for training. The upper row illustrates the FCD variance and the sum of FC matrices plotted against the parameters and , respectively, for three selected observations marked by red circles. The lower panels display the parameter estimations corresponding to the selected observations. Due to structural non-identifiability in the product of and , the posterior distributions appear diffuse, showing a nonlinear dependency (banana shape) in the joint posterior plots.
Tables
Parameter descriptions for capturing whole-brain dynamics using the Wilson-Cowan neural mass model.
| Parameter | Description | Value | Prior |
|---|---|---|---|
| Excitatory to excitatory synaptic strength | 16.0 | ||
| Inhibitory to excitatory synaptic strength | 12.0 | ||
| Excitatory to inhibitory synaptic strength | 15.0 | ||
| Inhibitory to inhibitory synaptic strength | 3.0 | ||
| Time constant of excitatory population | 8.0 | ||
| Time constant of inhibitory population | 8.0 | ||
| Sigmoid slope for excitatory population | 1.3 | ||
| Sigmoid slope for inhibitory population | 2.0 | ||
| Sigmoid threshold for excitatory population | 4.0 | ||
| Sigmoid threshold for inhibitory population | 3.7 | ||
| Maximum output of sigmoid for excitatory population | 1.0 | ||
| Maximum output of sigmoid for inhibitory population | 1.0 | ||
| Firing threshold for excitatory population | 0.0 | ||
| Firing threshold for inhibitory population | 0.0 | ||
| Refractoriness of excitatory population | 1.0 | ||
| Refractoriness of inhibitory population | 1.0 | ||
| Scaling constant for excitatory output | 0.994 | ||
| Scaling constant for inhibitory output | 0.999 | ||
| Gain of excitatory population | 1.0 | ||
| Gain of inhibitory population | 1.0 | ||
| External input to excitatory population | 0.0 | ||
| External input to inhibitory population | 0.0 | ||
| Global coupling strength (excitatory) | 0.0 | ||
| Global coupling strength (inhibitory) | 0.0 | ||
| Standard deviation of Gaussian noise | 0.005 |
Parameter descriptions for capturing whole-brain dynamics using Jansen-Rit neural mass model.
EP: excitatory populations, IP: inhibitory populations, PSP: post synaptic potential, PSPA: post synaptic potential amplitude.
| Parameters | Description | Value | Prior |
|---|---|---|---|
| Excitatory PSPA | 3.25 mV | ||
| Inhibitory PSPA | 22 mV | ||
| Time constant of excitatory PSP | |||
| Time constant of inhibitory PSP | |||
| Average numbers of synapses between EP | |||
| Average numbers of synapses between IP | |||
| Maximum firing rate | 5 Hz | ||
| Potential at half of maximum firing rate | 6 mV | ||
| Slope of sigmoid function at | |||
| Average numbers of synapses between neural populations | 135 | ||
| Scaling the strength of network connections | 1.5 |
Parameter descriptions for capturing whole-brain dynamics using Stuart-Landau oscillator.
| Parameters | Description | Value | Prior |
|---|---|---|---|
| Bifurcation parameter | -5 | ||
| Natural angular frequency | |||
| Noise factor | 10-4 | ||
| Average conduction velocity | 6.0 m/s | ||
| Global coupling parameter | 350 |
Parameter descriptions for capturing whole-brain dynamics using 2D Epileptor neural mass model.
| Parameter | Description | Value | Prior |
|---|---|---|---|
| Input electric current | 3.1 | ||
| System time constant | 90ms | ||
| Spatial map of epileptogenicity | –3.65 | ||
| Global scaling factor on network connections | 1.0 |
Parameter descriptions for capturing whole-brain dynamics using Montbrió model.
| Parameter | Description | Nominal value | Prior |
|---|---|---|---|
| Characteristic time constant | 1 ms | ||
| Synaptic weight | 14.5ms-1 | ||
| Spread of the heterogeneous noise distribution | 0.7ms-1 | ||
| Input current representing stimulation | 0.0 | ||
| Gaussian noise variance | 0.037 | ||
| Excitability | –4.6 | ||
| Scaling the strength of network connections | 0.56 |
Parameter descriptions for capturing whole-brain dynamics using the Wong-Wang model.
| Parameter | Description | Value | Prior |
|---|---|---|---|
| Max feeding rate of | 270 n/C | ||
| Half saturation of | 108 Hz | ||
| Control the steepness of curve of | 0.154 s | ||
| Kinetic parameter | 0.641/1000 | ||
| Synaptic time constant | 100 ms | ||
| Synaptic coupling | 0.2609 nA | ||
| Local excitatory recurrence | 0.6 | ||
| Overall effective external input | 0.3 nA | ||
| Scaling the strength of network connections | 6.28 | ||
| Noise amplitude | 0.005 |
Parameter descriptions for the Balloon-Windkessel model to map neural activity to the BOLD signals detected in fMRI.
| Parameter | Description | Value |
|---|---|---|
| Rate constant of vasodilatory signal decay in seconds | 1.5 | |
| Time of flow-dependent elimination in seconds | 4.5 | |
| Grubb’s vessel stiffness exponent | 0.2 | |
| Hemodynamic transit time in seconds | 1.0 | |
| Efficacy of synaptic activity to induce signal | 0.1 | |
| Slope of intravascular relaxation rate in Hertz | 25.0 | |
| Frequency offset at outer surface of magnetized vessels | 40.3 | |
| Ratio of intra- and extravascular BOLD signal at rest | 1.43 | |
| Resting blood volume fraction | 0.02 | |
| Resting oxygen extraction fraction | 0.8 | |
| Echo time, 1.5 T scanner | 0.04 |