This probabilistic approach is designed to estimate the posterior distribution of control parameters in virtual brain models from whole-brain recordings. (A) The process begins with constructing a personalized connectome using diffusion tensor imaging and a brain parcellation atlas, such as Desikan-Killiany [145], Automated Anatomical Labeling [146], or VEP [147]. (B) The personalized virtual brain model is then assembled. Neural mass models describing the averaged activity of neural populations, in the generic form of , are placed to each brain region, and interconnected via the structural connectivity matrix. Initially, the control parameters are randomly drawn from a simple prior distribution. (C) Next, the VBI operates as a simulator that uses these samples to generate time series data associated with neuroimaging recordings. (D) We extract a set summary statistics from the low-dimensional features of the simulations (FC, FCD, PSD) for training. (E) Subsequently, a class of deep neural density estimators is trained on pairs of random parameters and their corresponding data features to learn the joint posterior distribution of the model parameters. (F) Finally, the amortized network allows us to quickly approximate the posterior distribution for new (empirical) data features, enabling us to make probabilistic predictions that are consistent with the observed data.
Parameter descriptions for capturing whole-brain dynamics using the Wilson-Cowan neural mass model.
Parameter descriptions for capturing whole-brain dynamics using Jansen-Rit neural mass model.
EP: excitatory populations, IP: inhibitory populations, PSP: post synaptic potential, PSPA: post synaptic potential amplitude.
Parameter descriptions for capturing whole-brain dynamics using Stuart-Landau oscillator.
Parameter descriptions for capturing whole-brain dynamics using 2D Epileptor neural mass model.
Parameter descriptions for capturing whole-brain dynamics using Montbrió model.
Parameter descriptions for capturing whole-brain dynamics using the Wong-Wang model.
Parameter descriptions for the Balloon-Windkessel model to map neural activity to the BOLD signals detected in fMRI.
Flowchart of the VBI Structure.
This toolkit consists of three main modules: (1) The Simulation module, implementing various whole-brain models, such as Wilson-Cowan (WCo), Jansen-Rit (JR), Stuart-Landau (SL), Epileptor (EPi), Montbrió (MPR), and Wong-Wang (WW), across different numerical computing libraries (C++, Cupy, PyTorch, Numba). (2) The Features module, offering an extensive toolbox for extracting low-dimensional data features, such as spectral, temporal, connectivity, statistical, and information theory features. (3) The Inference module, providing neural density estimators (such as MAF and NSF) to approximate the posterior of parameters.
Bayesian inference on heterogeneous control parameters in the whole-brain network of Wilson-Cowan model.
The set of inferred parameters is , with the global scaling parameter G and average external input current to excitatory populations per Pi region, given i ∈ {1, 2, …, Nn = 88} parcelled regions. Summary statistics of power spectrum density (PSD) were used for training of MAF density estimators, with a budget of 260k simulations. (A) and (B) illustrate the observed and predicted neural activities, respectively. (C) and (D) show the observed and predicted PSDs as the data. (E) and (F) display the posterior distribution of Pi per region, and global coupling G, respectively. The ground truth and prior are represented by a vertical green line and a blue distribution, respectively. (G) shows the inference evaluation using posterior shrinkage and z-score.
Bayesian inference on heterogeneous control parameters in the whole-brain network of Jansen-Rit model.
The set of inferred parameters is , with the global scaling parameter G and average numbers of synapse between neural populations per Ci region, given i ∈ {1, 2, …, Nn = 88} parcelled regions. Summary statistics of power spectrum density (PSD) were used for training, with a budget of 50k simulations. (A) and (B) illustrate the observed and predicted neural activities, respectively. (C) and (D) show the observed and predicted data features, such as PSD. (E) and (F) display the posterior distribution of Ci per region, and global coupling G, respectively. The ground truth and prior are represented by vertical green lines and a blue distribution, respectively. (G) shows the inference evaluation using the shrinkage and z-score of the estimated posterior distributions.
Bayesian inference on global scaling parameter G and the averaged velocity V of signal transmission using the whole-brain network model of Stuart-Landau oscillators.
The set of estimated parameters is , and the summary statistics of PSD signals with a budget of 2k simulations were used for training. (A) illustrates exemplary observed and predicted neural activities (in green and red, respectively). (B) shows the observed and predicted PSD signals (in green and red, respectively). (C) and (D) display the posterior distribution of averaged velocity V and global coupling G, respectively. The true values and prior are shown as vertical green lines and a blue distribution, respectively. (E) shows the joint posterior distribution indicating a high correlation between posterior samples. (F) illustrates the sensitivity analysis based on the eigenvalues of the posterior distribution.
Bayesian inference on the spatial map of epileptogenicity across brain regions in the VEP model.
The set of inferred parameters is , as the global scaling parameter and spatial map of epileptogenicity with i ∈ {1, 2, …, Nn = 88}parcelled regions. (A) The observed seizure envelope generated by the Epileptor model, given two regions as epileptogenic zones (in red) and three regions as propagation zones (in yellow), while the rest are healthy (in green). (B) The predicted seizure envelope, by training MAF model on a dataset containing 10k simulations, using only the total power and seizure onset per region as the data features. (C) and (D) show the observed and predicted data features, respectively. (E) and (F) show the posterior distributions of heterogeneous control parameters ηi, and global coupling parameter G, respectively. (G) The posterior z-scores versus posterior shrinkages for estimated parameters.
Bayesian inference on heterogeneous control parameters in the whole-brain dynamics using Montbrió model.
The set of inferred parameters is , as the global scaling parameter and excitability per region, with i ∈{1, 2, …, Nn = 88 } parcelled regions. VBI provides accurate and reliable posterior estimation using both spatio-temporal and functional data features for training, with a budget of 500k simulations. (A) and (B) illustrate the observed and predicted BOLD signals, respectively. (C) and (D) show the observed (upper triangular) and predicted (lower triangular) data features (FC and FCD), respectively. (E) and (F) display the posterior distribution of excitability parameters ηi per region, and global coupling G, respectively. The true values and prior are shown as vertical green lines and a blue distribution, respectively. (G) shows the inference evaluation by the shrinkage and z-score of the posterior distributions.
Bayesian inference on the parametric mean-field model of Wong-Wang (also known as pDMF model), with linear coefficients (aw, bw, cw, aI, bI, cI, aσ, bσ, cσ) ∈ ℝ9, reparameterizing the recurrent connection strength wi, external input current Ii, and noise amplitude σi for each region.
Summary statistics of spatio-temporal and functional data features were used for training, with a budget of 50k simulations. (A) The diagonal panels display the ground-true values (in green), the uniform prior (in blue), and the estimated posterior distributions (in red). The upper diagonal panels illustrate the joint posterior distributions between parameters, along with their correlation (ρ, in the upper left corners), and ground-truth values (green stars). High-probability areas are color-coded in yellow, while low-probability areas are shown in black. (B) The observed and predicted BOLD time series (in green and red, respectively). (C) The observed and predicted data features, such as FC/FCD matrices. (D) The inference evaluation by calculating the shrinkage and z-score of the estimated posterior distributions.
(A) Benchmarking the simulation cost of the whole-brain network of Montbrió model, with 2.5M time step on Intel(R) Core(TM) i9-10900 CPU 2.80 GHz (in red) and NVIDIA RTX A5000 GPU (in blue). GPUs deliver substantial speedups up to 100x over multi-core CPUs. (B) Comparing MAF and NSF density estimators, MAF is typically 2-4 times faster than NSF during the training process.
Inference across different configurations of heterogeneous generative parameters in the whole-brain network model based on the Wilson-Cowan framework.
The set of unknown control parameters is . We observe accurate posterior estimation across different scenarios by varying the external input current map across brain regions. Features derived from the power spectrum density were used for training, with a simulation budget of 260k. The regions are color-coded based on their current values: blue indicates low, green medium, and red high external input current and the size of the regions reflects the shrinkage of the posterior distribution for the corresponding inferred parameters. Top: axial view. Bottom: sagittal view. The three configurations (from left to right) correspond to: randomly assigned heterogeneous Pi values, increased Pi for the five strongest nodes, and increased Pi for the five weakest nodes.
Bayesian inference on heterogeneous control parameters in the whole-brain network of Wilson-Cowan model, as Figure 3, but ignoring the spatial information of the data features.
Here, we observe that the MAF density estimator leads to a multimodal posterior distribution for low values of Pi, when trained with a reduced set of data features, suggesting increased uncertainty or potential non-identifiability under limited information.
Same as Figure S3, but using the NSF density estimator. Here, we observe more accurate estimation, however, NSF required approximately 6 hours of training compared to 20 minutes for MAF.
Inference on heterogeneous generative parameters in the whole-brain network of the Jansen-Rit model (Equation 2).
The set of inferred parameters is , consisting of the global scaling parameter G and the average number of synapses between neural populations Ci per region, with i∈{ 1, 2, …, Nn = 88 }parcelled regions. For panels A to C, only the area under the curve of the power spectral density (PSD) in the [0,30] Hz band was used, while for panels D to F, additional statistical features of the PSD were included in the training, with a total budget of 50k simulations. (A) and (D) illustrate the observed and predicted EEG signals. (B) and (E) show the observed and predicted PSD signals. (C) and (F) display the posterior distribution of Ci for each region. The true values are indicated by vertical green lines, and the prior distribution is shown in blue.
Bayesian inference on the spatial map of epileptogenicity across brain regions in the VEP model (Equation 4), with slow time scale separation, τ = 10 ms.
(A) The observed seizure envelope generated by the 2D Epileptor model given two regions as epileptogenic zones (in red) and three regions as propagation zones (in yellow), while the rest are healthy (in green). (B) The predicted seizure envelope, by training MAF density estimator on a dataset containing 10k simulations, using only the total power per region as the data features. (C) and (D) show the observed and predicted data features, respectively. (E) and (F) show the posterior distribution of heterogeneous ηi, and global coupling parameter G, respectively. (F) The posterior z-scores versus posterior shrinkages for estimated parameters.
Evaluating the performance of estimation over the global coupling parameter, under different levels of additive noise in the VEP model.
(A) Simulated time series under different levels of additive noise. (B) and (C) Posterior shrinkage and z-score, respectively, as function of noise level σ. (D) Ground-true (in green), prior (in blue), and posterior distributions (in red) for global coupling parameter. We observe that at higher noise levels (e.g., σ = 0.3 and σ = 0.6), the ground truth is not captured within the posterior, leading to increased z-scores compared to the noise-free case σ = 0.
Evaluating the accuracy of dynamical noise estimation in the VEP model.
(A) Simulated time series generated under different levels of dynamical noise. (B) and (C) Posterior shrinkage and z-score, respectively, across different scenarios, demonstrating the accuracy and reliability of estimating the dynamical noise parameter. (D) Ground-true (in green), prior (in blue) and posterior distributions (in red), for different realizations of the dynamical noise parameter σ in the system. We observe robust estimation even at high levels of dynamical noise, though for σ ≥ 1.6 the posterior sampling fails (shaded area).
Inference over different configurations of heterogeneous generative parameters in the whole-brain network model of the Montbrió, (see Figure 7).
The set of inferred parameters is , as the global scaling parameter and excitability per region, with i ∈ {1, 2, …, Nn = 88} parcelled regions.
Evaluating the reliability and accuracy of estimation (parameter G) with respect to the number of simulation used for training the MAF/NSF density estimator by (A) posterior shrinkage, and (B) posterior z-score, versus the number of simulations.
Scatter plot of summary statistics from the FC/FCD matrices in the whole-brain network of Montbrió model (Equation 5).
Inference on heterogeneous generative parameters in the whole-brain network of Montbrió model (Equation 5) involves estimating a parameter set denoted as from FC/FCD matrices.
Here, G is the global scaling parameter, and ηi represents the excitability of each brain region, where i ∈ {1, 2, …, Nn = 88} corresponds to the parcelled regions. Relying on only functional data features (FC/FCD) for training proves insufficient to accurately estimate the posterior distribution of heterogeneous excitabilities, even with a budget of 500k simulations. In (A), the posterior estimates are shown in red, the true values of ηi in green, and the prior distribution in blue. Panels (B) and (C) show the observed and predicted FC/FCD matrices, respectively, as the data features. The simulation to generate training data took approximately 24 hours, utilizing 10 GPUs concurrently. The training process lasted around 8 hours, while sampling required just a few seconds.
Inference of the global coupling parameter in the whole-brain network of Montbrió model (Equation 5) using functional data features, using a budget of only 100 simulations.
(A) The uniform prior distribution of G in blue, the estimated posterior in red and the true value is shown in green. (B) and (C) The observed and predicted FC/FCD matrices, respectively.
Inference of the global coupling parameter in the whole-brain model of Montbrió (Equation 5) using TVB simulator, using summation of FC and FCD matrix elements as low dimensional data features, constrained by a budget of only 100 simulations.
(A) The uniform prior distribution of G in blue, the estimated posterior in red and the true value is shown in green. (B) and (C) show the observed and predicted FC/FCD matrices, respectively.
Evaluating the performance of estimation over the global coupling parameter under different levels of noise added to the BOLD signal in the Montbrió model.
(A) and (B) Simulated time series and corresponding FCD matrices, respectively, under different levels of additive noise. (C) and (D) Posterior shrinkages and z-scores, respectively, for different noise values. (E) Ground-true (in green), prior (in blue) and posterior distribution (in red), for estimation of coupling parameter G, given different values of additive noise to the BOLD signals, indicating robust inference. However, for very high values of noise (σ ≥ 0.45) the posterior sampling fails (shaded area).
Evaluating the accuracy of dynamical noise estimation in the Montbrió model.
(A) and (B) Simulated time series and corresponding FCD matrices, respectively, under different levels of dynamical noise. (C) and (D) Posterior shrinkages and z-scores, respectively, for different noise values. (E) Ground-true (in green), prior (in blue) and posterior distributions (in red), for different realizations of the dynamical noise parameter σ in the system. We observe robust estimation even at high levels of dynamical noise.
The distribution of observed and predicted functional connectivity (FC) in panel and the functional connectivity dynamics in panel (B), which are demonstrated in Figure 8.
The non-identifiability in estimating G and J in the Wong-Wang model (Equation 6).
A total of 16k simulations and functional data features (see Figure S18) were used for training. The upper row illustrates the FCD variance and the sum of FC matrices plotted against the parameters G and J, respectively, for three selected observations marked by red circles. The lower panels display the parameter estimations corresponding to the selected observations. Due to structural non-identifiability in the product of G and J, the posterior distributions appear diffuse, showing a nonlinear dependency (banana shape) in the joint posterior plots.
Scatter plot of summary statistics from the FC/FCD matrices in the whole-brain network of Wong-Wang model (Equation 6).
Estimation of the global coupling parameter across whole-brain models, evaluated for multiple configurations (each represented by a violin plot).
The MAF density estimator was trained using the same features as described for the heterogeneous case in the main text. The simulation budget was limited to a maximum of 300. The green dashed line indicates an ideal estimation, while the gray dashed line represents a linear regression based on the mean values of the posterior sampling distributions. We can observe an accurate posterior estimation for coupling parameter across different models and configurations, except for the Jansen-Rit model, where coupling did not induce a significant change in the intrinsic frequency of regional activity. Note that the Montbrió model shows more diffuse estimations compared to the others, due to switching dynamics driven by noise.