The NeuroML ecosystem for standardized multi-scale modeling in neuroscience

Ankur Sinha; Padraig Gleeson; Bóris Marin; Salvador Dura-Bernal; Sotirios Panagiotou; Sharon Crook; Matteo Cantarelli; Robert C Cannon; Andrew P Davison; Harsha Gurnani; R Angus Silver

doi:10.7554/eLife.95135.2

Introduction

Development of an in-depth, mechanistic understanding of brain function in health and disease requires different scientific approaches spanning multiple scales, from gene expression to behavior. Although “wet” experimental approaches are essential for characterizing the properties of neural systems and testing hypotheses, theory and modeling are critical for exploring how these complex systems behave across a wider range of conditions, and for generating new experimentally testable, physically plausible hypotheses. Theory and modeling also provide a way to integrate a panoply of experimentally measured parameters, functional properties, and responses to perturbations into a physio-chemically coherent framework that reproduces the properties of the neural system of interest (Einevoll et al., 2019; Yao et al., 2022; Poirazi and Papoutsi, 2020; Gurnani and Silver, 2021; Gleeson et al., 2018; Cayco-Gajic et al., 2017; Billings et al., 2014; Vervaeke et al., 2010; Kriener et al., 2022; Billeh et al., 2020; Markram et al., 2015).

Computational models in neuroscience often focus on different levels of description. For example, a cellular physiologist may construct a complex multi-compartmental model to explain the dynamical behavior of an individual neuron in terms of its morphology, biophysical properties, and ionic conductances (Hay et al., 2011; De Schutter and Bower, 1994; Migliore et al., 2005). In contrast, to relate neural population activity to sensory processing and behavior, a systems neurophysiologist may build a circuit level model consisting of thousands of much simpler integrate-and-fire neurons (Lapicque, 1907; Potjans and Diesmann, 2014; Brunel, 2000). Domain specific tools have been developed to aid the construction and simulation of models at varying levels of biological detail and scales. An ecosystem of diverse tools is powerful and flexible, but it also creates serious challenges for the research community (Cannon et al., 2007). Each tool typically has its own design, features, Application Programming Interface (API) and syntax, a custom set of utility libraries, and finally, a distinct machine-readable representation of the model’s physiological components. This represents a complex landscape for users to navigate. Additionally, models developed in different simulators cannot be mixed and matched or easily compared, and the translation of a model from one tool-specific implementation to another can be non-trivial and error-prone. This fragmentation in modeling tools and approaches can act as a barrier to neuroscientists who wish to use models in their research, as well as impede how Findable, Accessible, Interoperable, and Reusable (FAIR) models are (Wilkinson et al., 2016).

To counter fragmentation and promote cooperation and interoperability within and across fields, standardization is required. The International Neuroinformatics Co-ordinating Facility (INCF) (Abrams et al., 2022) has highlighted the need for standards to “make research outputs machinereadable and computable and are necessary for making research FAIR” (INCF, 2023). In biology, several community standards have been developed to describe experimental data (e.g. Brain Imaging Data Structure (BIDS) (Gorgolewski et al., 2016), Neurodata Without Borders (NWB) (Teeters et al., 2015)) and computational models (e.g. Systems Biology Markup Language (SBML) (Hucka et al., 2003), CellML (Lloyd et al., 2004), Scalable Open Network Architecture TemplAte (SONATA) (Dai et al., 2020), PyNN (Davison et al., 2009) and Neural Open Markup Language (NeuroML) (Gleeson et al., 2010)). These standards have enabled open and interoperable ecosystems of software applications, libraries and databases to emerge, facilitating the sharing of research outputs, an endeavor encouraged by a growing number of funding agencies and scientific journals.

The initial version of the NeuroML standard, version 1 (NeuroMLv1), was originally conceived as a model description format (Goddard et al., 2001), and implemented as a three layered, declarative, modular, simulator independent language (Gleeson et al., 2010). NeuroMLv1 could describe detailed neuronal morphologies and their biophysical properties as well as specific instantiations of networks. It enabled the archiving of models in a standardized format and addressed the issue of simulator fragmentation by acting as the common language for model exchange between established simulation environments—NEURON (Hines and Carnevale, 1997; Awile et al., 2022), GENESIS (Bower and Beeman, 1997), and MOOSE (Ray and Bhalla, 2008). While solving a number of long-standing problems in computational neuroscience, NeuroMLv1 had several key limitations. The most restrictive of these was that the dynamical behavior of model elements was not formally described in the standard itself, making it only partially machine readable. Information on the dynamics of elements (i.e. how the state variables should evolve in time) was only provided in the form of human-readable documentation, requiring the developers of each new simulator to reimplement the behavior of these elements in their native format. Additionally, introduction of new model components required updates to the standard and all supporting simulators, making extension of the language difficult. Finally, the use of Extensible Markup Language (XML) as the primary interface language limited usability—applications would generally have to add their own code to read/write XML files.

To address these limitations, NeuroML was redesigned from the ground up in version 2 (NeuroMLv2) using the Low Entropy Modeling Specification (LEMS) language (Cannon et al., 2014). LEMS was designed to define a wide range of physio-chemical systems, enabling the creation of fully machine-readable, formal definitions of the structure and dynamics of any model components. Modeling elements in NeuroMLv2 (cells, ion channels, synapses) have their mathematical and structural definitions described in LEMS (e.g. the parameters required and how the state variables change with time). Thus, NeuroMLv2 retains all the features of NeuroMLv1—it remains modular, declarative, and continues to support multiple simulation engines—but unlike version 1, it is extensible, and all specifications are fully machine-readable. NeuroMLv2 also moved to Python as its main interface language and provides a comprehensive set of Python libraries to improve usability (Vella et al., 2014), with XML retained as a machine-readable serialization format (i.e. the form in which the model files are saved/shared).

Since its release in 2014, the NeuroMLv2 standard, the software ecosystem, and the community have all steadily grown. An open, community based governance structure was put in place— an elected Editorial Board, overseen by an independent Scientific Committee, maintains the standard and core software tools—APIs, reference simulators, and utilities. Although these tools were initially focused on enabling the simulation of models on multiple platforms, they have been expanded to support all stages of the model life cycle (Fig. 1). Modelers can use these tools to easily create, inspect and visualize, validate, simulate, fit and optimize, share and disseminate NeuroMLv2 models and outputs (Billings et al., 2014; Cayco-Gajic et al., 2017; Gurnani and Silver, 2021; Kriener et al., 2022; Gleeson et al., 2019). To provide clear, concise, searchable information for both users and developers, the NeuroML documentation has been significantly expanded and re-deployed using the latest modern web technologies (https://docs.neuroml.org). Increased community-wide collaborations have also extended the software ecosystem well beyond the NeuroMLv2 tools developed by the NeuroML team: additional simulators such as Brian (Stimberg et al., 2019), NetPyNE (DuraBernal et al., 2019), Arbor (Abi Akar et al., 2019) and EDEN (Panagiotou et al., 2022) all support NeuroMLv2. We have worked to ensure interoperability with other structured formats for model development in neuroscience such as PyNN (Davison et al., 2009) and SONATA (Dai et al., 2020). Platforms for collaboratively developing, visualizing, and sharing NeuroML models (Open Source Brain (OSB) (Gleeson et al., 2019)) as well as a searchable database of NeuroML model components (NeuroML Database (NeuroML-DB) (Birgiolas et al., 2023)) have been developed. These enhancements, driven by an ever expanding community, have helped NeuroMLv2 grow into a standard that has been officially endorsed by international organizations such as the INCF and COmputational Modeling in Biology NEtwork (COMBINE) (Hucka et al., 2015), and that is now sufficiently mature to be incorporated into a wide range of research workflows.

The NeuroML software ecosystem supports all stages of the model development life cycle.

In this paper, we provide an overview of the current scope of version 2 of the NeuroML standard, describe the current software ecosystem and community, and outline the extensive resources to assist researchers to incorporate NeuroML into their modeling work. We demonstrate, with examples, that NeuroML supports users at all stages of the model development life cycle (Fig. 1) and promotes FAIR principles in computational neuroscience. We highlight the various NeuroML tools and libraries, additional utilities, supported simulation engines, and the related projects that build upon NeuroML for automated model validation, advanced analysis, visualization, and sharing/reuse of models. Finally, we summarize the organizational aspects of NeuroML, its governance structure and community.

Results

NeuroML provides a ready to use set of curated model elements

A central aim of the NeuroML initiative is to enable and encourage the use of multi-scale biophysically detailed models of neurons and neuronal circuits in neuroscience research. The initiative takes a range of steps to achieve this aim. NeuroML provides users with a curated library of model elements that form the NeuroML standard (An index of all the model elements included in version 2.3 of NeuroML, with links to further online documentation, is provided in Appendix 1 Tables 5 and 6.) (Fig. 2). The standard is maintained by the NeuroML Editorial Board that has identified a fundamental set of model types to support, to ensure that a significant proportion of commonly used neurobiological modeling entities can be described with the language. This includes (but is not limited to): active membrane conductances (using HodgkinHuxley style (Hodgkin and Huxley, 1952) or kinetic scheme based ionic conductances), multiple synapse and plasticity mechanisms, detailed multi-compartmental neuron models with morphologies and biophysical properties, abstract point neuron models, and networks of such cells spatially arranged in populations, connected by targeted projections, receiving spiking and current based inputs.

NeuroML is a modular, hierarchical format that supports multi-scale modeling. Elements in NeuroML are formally defined, independent, self-contained building blocks with hierarchical relationships between them. (a) Models of ionic conductances can be defined as a composition of gates, each with specific voltage (and potentially [Ca²⁺]) dependence that controls the conductance. (b) Morphologically detailed neuronal models specify the 3D structure of the cells, along with passive electrical properties, and reference ion channels that confer membrane conductances. (c) Network models contain populations of these cells connected via synaptic projections. (d) A truncated illustration of main categories of the NeuroMLv2 standard elements and their hierarchies. The standard includes commonly used model elements/building blocks that have been pre-defined for users: Cells: neuronal models ranging from simple spiking point neurons to biophysically detailed cells with multi-compartmental morphologies and active membrane conductances; Synapses and ionic conductance models: commonly used chemical and electrical synapse models (gap junctions), and multiple representations for ionic conductances; Inputs: to drive cell and network activity, e.g. current or voltage clamp, spiking background inputs; Networks: of populations (containing any of the aforementioned cell types), and projections. The full list of standard NeuroML elements can be found in Appendix 1 Tables 5 and 6.

The NeuroMLv2 standard consists of two levels that are designed to enable users to easily create their models without worrying about simulator-specific details. The first level defines a formal “schema” for the standard model elements, their attributes/parameters (e.g. an integrate and fire cell model and its necessary attributes: a threshold parameter, a reset parameter, etc.), and the relationships between them (e.g. a network contains populations; a multi-compartmental cell morphology contains segments). This allows the validation of the completeness of the description of individual NeuroML model elements and models, prior to simulation. The second level defines the underlying dynamical behavior of the model elements (e.g. how the time varying membrane potential of a cell model is to be calculated). Most users do not need to interact with this level (which is enabled by LEMS), which, among other features, enables the automated translation of simulator-independent NeuroML models into simulator-specific code.

Thus, modelers can use the standard NeuroML elements to conveniently build simulator-independent models, while also being able to examine and extend the underlying implementations of models. As a simulator independent language, NeuroML also promotes interoperability between different computational modeling tools, and as a result, the standard library is complemented by a large, well maintained ecosystem of software tools that supports all stages of the model life cycle—from creation, analysis, simulation and fitting, to sharing and reuse. Finally, as discussed in later sections, for advanced use cases where the existing NeuroML model building blocks are insufficient, NeuroML also includes a framework for creating and including new model elements.

NeuroML is a modular, structured language for defining FAIR models

NeuroMLv2 is a modular, structured, hierarchical, simulator independent format. All NeuroML elements are formally defined, independent and self-contained with hierarchical relationships between them. An “ionic conductance” model element in NeuroML, for example, can contain zero, one or more “gates” and be added into a “cell” model element along with a “morphology” element, which can then fit into a “population” of a “network” (Fig. 2). To support the range of electrical properties found in biological neurons, ionic conductances with distinct ionic selectivities and dynamics can be generated in NeuroML through the inclusion of different types of gates (e.g. activation, inactivation), their dependence on variables such as voltage and [Ca²⁺] and their reversal potential. Cell types with different functional and biophysical properties can then be generated by conferring combinations of ionic conductances on their membranes. The conductance density can be adjusted to generate the electrophysiological properties found in real neurons. In practice, many examples of ionic conductances that underlie the electrical behavior of neurons are already available in NeuroMLv2 and can simply be inserted into a cell membrane (Fig. 2). Indeed, a model element, once defined in NeuroML, acts as a building block that may be reused any number of times within or across models. Elements such as ionic conductances, cell biophysics, cell morphologies, and cell definitions that incorporate them can be serialized in separate files and “included” in other models (e.g. morphologies (https://docs.neuroml.org/Userdocs/ImportingMorphologyFiles.html#neuroml2)). Such reuse of model components speeds model construction and prototyping irrespective of the simulation engine used.

The defined structure of each model element and the relationships between them inform users of exactly how model elements are to be created and combined. This encourages the construction of well-structured models, reduces errors and redundancy and ensures that FAIR principles are firmly embedded in NeuroML models and the ecosystem of tools. As we will see in the following sections, NeuroML’s formal structure also enables features such as model validation prior to simulation, translation into simulation specific formats, and the use of NeuroML as a common language of exchange between different tools.

NeuroML supports a large ecosystem of software tools that cover all stages of the model life cycle

Model building and the generation of scientific knowledge from simulation and analysis of models is a multi-step, iterative process requiring an array of software tools. NeuroML supports all stages of the model development life cycle (Fig. 1), by providing a single model description format that interacts with a myriad of tools throughout the process. Researchers typically assemble ad-hoc sets of scripts, applications, and processes to help them in their investigations. In the absence of standardization, they must work with the specific model formats and APIs that each tool they use requires, and somehow convert model descriptions when using multiple applications in a toolchain. NeuroML addresses this issue by providing a common language for the use and exchange of models and their components between different simulation engines and modeling tools. The NeuroML ecosystem includes a large collection of software tools, both developed and maintained by the main NeuroML contributors (the “core NeuroML tools and libraries”: jNeuroML, pyNeuroML, APIs) and those external applications that have added NeuroML support (Fig. 3, Fig. 4a, Appendix 1 Tables 1 and 2).

The ecosystem of NeuroML complaint tools and their relation to the model life cycle. The inner circle shows the core NeuroML tools and libraries that are maintained by the NeuroML developers. These provide functionality to read, modify, or create new NeuroML models, as well as to validate, analyze, visualize and simulate the models. The outermost layer shows NeuroML compliant tools that have been developed independently to allow various interactions with NeuroML models. These complement the core tools by facilitating model creation, validation, visualization, simulation, fitting/optimization, sharing and reuse. Further information on each of the tools shown here can be found in Appendix 1 Tables 1 and 2.

The core NeuroML software stack, and an example NeuroML model created using the Python NeuroML tools. (a) The core NeuroML software stack consists of Java (blue) and Python (orange) based applications/libraries, and the LEMS model ComponentType definitions (green), wrapped up in a single package, pyNeuroML. Each of these modules can be used independently or the whole stack can be obtained by installing pyNeuroML with the default Python package manager, Pip: pip install pyneuroml. (b) An example of how to create a simple NeuroML model is shown, using the NeuroMLv2 Python API (libNeuroML) to describe a model consisting of a population of 10 integrate and fire point neurons (IafTauCell) in a network. The IafTauCell, Network, Population, and NeuroMLDocument model ComponentTypes are provided by the NeuroMLv2 standard. The underlying dynamics of the model are hidden from the user, being specified in the LEMS ComponentType definitions of the elements (see Methods). The simulator-independent NeuroML model description can be simulated on any of the supported simulation engines. (c) XML serialization of the NeuroMLv2 model description shows the correspondence between the Python object model and the XML serialization.

Step by step guides for using NeuroML, illustrating the various stages of the model development life cycle. These include Introductory guides aimed at teaching the fundamental NeuroML concepts, Advanced guides illustrating specific modeling workflows, and Walkthrough guides discussing the steps required for converting models to NeuroML. An updated list is available at http://neuroml.org/gettingstarted.

The core NeuroML tools and libraries include APIs in several programming languages—Python, Java, C++ and MATLAB. These tools provide critical functionality to allow users to interact with NeuroML components and build models. Using these, researchers can build models from scratch, or read, modify, analyze, visualize, and simulate existing NeuroML models on supported simulation platforms. Furthermore, developers can also use the core tools, libraries and APIs to support NeuroML in their own applications.

The simulation platforms (e.g. EDEN (Panagiotou et al., 2022), NEURON (Hines and Carnevale, 1997)), along with other independently developed tools, form the next layer of the software ecosystem—providing extra functionality such as interactive model construction (e.g. neuroConstruct (Gleeson et al., 2007), NetPyNE (Dura-Bernal et al., 2019)), additional visualization (e.g. OSB (Gleeson et al., 2019)), analysis (e.g. NeuroML-DB (Birgiolas et al., 2023)), data-driven validation (e.g. SciUnit (Gerkin et al., 2019)), and archival/sharing (e.g. OSB, NeuroML-DB). Indeed, OSB and NeuroML-DB are prime examples of how advanced neuroinformatics resources can be built on top of standards such as NeuroML.

Table 1 lists interactive, step-by-step guides in the NeuroML documentation, which can be followed to learn the fundamental NeuroML concepts, as well as illustrate how NeuroML compliant tools can be used to achieve specific tasks across the model development life cycle. In the following sections we discuss the specific functionality available at each stage of model development.

Creating NeuroML models

The structured declarative elements of NeuroMLv2, when combined with a procedural scripting language such as Python, provide a powerful and yet intuitive “building block” approach to model construction. For this reason, Python is now the recommended language for interacting with NeuroML (Fig. 4), although XML remains the primary serialization language for the format (i.e. for saving to disk and depositing in model repositories (Fig. 5)). Python has emerged as a key programming language in science, including many areas of neuroscience (Muller et al., 2015). A Python based NeuroML ecosystem ensures that users can take advantage of Python’s features, and also use packages from the wider Python ecosystem in their work (e.g. Numpy (Harris et al., 2020), Matplotlib (Hunter, 2007)). pyNeuroML, the Python interface for working with NeuroML, is built on top of the Python NeuroML API, libNeuroML (Vella et al., 2014; Sinha et al., 2023a,b) (Fig. 4).

Workflow showing how to create and simulate NeuroML models using Python. The Python API can be used to create models which may include elements built from scratch from the NeuroML standard, re-use elements from previously created models, or create new components based on novel model definitions expressed in LEMS (red). The generated model elements are saved in the default XML-based serialization (blue). The NeuroML core tools and libraries (orange) include modules to import model descriptions expressed in the XML serialization, and support multiple options for how simulators can execute these models (green). These include: (1) execution of the NeuroML models by reference simulators; (2) execution by other independently developed simulators that natively support NeuroML, such as EDEN; (3) generation of Python “import scripts” which allow NeuroML models to be imported (and converted to internal formats) by simulators which support this; (4) fully expanding the LEMS description of the models, which can be mapped to generated simulator specific scripts for target simulators; (5) mapping to other standardized formats in neuroscience and systems biology.

As illustrated in Fig. 5, Python can be used to combine different NeuroML components into a model. NeuroML supports several pathways for the creation of new models. Modelers may use elements included in the NeuroML standard, re-use user-defined NeuroML model elements from other models or define completely new model elements using LEMS (Fig. 5) (see section on extending NeuroML below). It is common for models to use a combination of these strategies, e.g. Gurnani and Silver (2021); Kriener et al. (2022); Cayco-Gajic et al. (2017), highlighting the flexibility provided by the modular design of NeuroML. NeuroML APIs support all of these workflows. The Python tools also include many additional higher level utilities to speed up model construction, such as factory functions, type hints, and convenience functions for building complex multi-compartmental neuron models (Box 1).

For construction of complex 3D circuit models, or for users who are not experienced with Python, a range of NeuroML compliant online and standalone applications with graphical user interfaces are available. These include NetPyNE’s interactive web interface (Dura-Bernal et al., 2019) (which is available on the latest version of OSB (https://v2.opensourcebrain.org)) and neuroConstruct (Gleeson et al., 2007) which can export models directly into NeuroML and LEMS. These applications can be used to build and simulate new NeuroML models without requiring programming. Thus, users can take advantage of the individual features provided by these applications to generate NeuroML compliant models and model elements.

Validating NeuroML models

Ensuring a model is “valid” can have different meanings at different stages of the life cycle—from checking whether the source files are in the correct format, to ensuring the model reproduces a significant feature of its biological counterpart. NeuroML’s hierarchical, well-defined structure allows users to check their model descriptions for correctness at multiple levels (Fig. 6), in a manner similar to multi-level testing in software development. Importantly, most of the validation tests in NeuroML are run on the models’ NeuroML descriptions prior to simulation.

NeuroML model development incorporates multi-level validation of models. Checks are performed on the model descriptions (blue) before simulation using validation at both the NeuroML and LEMS levels (green). After the models are simulated (yellow), further checks can be run to ensure the output is in line with expected behavior (brown). The OSB Model Validation (OMV) framework can be used to ensure consistent behavior across simulators, and comparisons can be made of model activity to their biological equivalents using SciUnit.

A first level of validation checks the structure of individual model elements against their formal specifications contained in the NeuroML standard. The standard includes information on the parameters of each model element, restrictions on parameter values, their allowed units, their cardinality, and the location of the model element in the model hierarchy—i.e. parent/children relationships. A second level of validation includes a suite of semantic and logical checks. For example, at this level, a model of a multi-compartmental cell can be checked to ensure that all segments referenced in segment groups (e.g. the group of dendritic segments) have been defined, and only defined once with unique identifiers. A list of validation tests currently included in the NeuroML core tools can be found in Appendix 1 Table 4. These can be run against NeuroML files at the command line or programmatically in Python—Box 1.

A key advantage of using the NeuroML2/LEMS framework is that dimensions and units are inbuilt into LEMS descriptions. This enables automated conversions of units, unit checking, together with the validation of equations. Any expressions in models which are dimensionally inconsistent will be highlighted at this stage. Note that LEMS handles unit conversions internally—modelers have flexibility in how they enter the units of parameter values (e.g. specifying conductance density in S∕m² or mS∕cm²) in the NeuroML files, with the underlying LEMS definitions ensuring that a consistent set of dimensions are used in model equations (Cannon et al., 2014). LEMS then takes care of mapping the entered units to the target simulator’s preferred units. This makes model definition, inspection, use, extension, and translation easier and less error-prone.

Once the set of NeuroML files are validated, the model can be simulated, and checks can be made to test whether execution produces consistent results (e.g. firing rate of neurons in a given population) across multiple simulators (or versions of the same simulator). For this, the OSB Model Validation (OMV) framework has been developed (Gleeson et al., 2019). This framework can automatically check that the output (e.g. spike times) of a NeuroML model running on a given simulator is within an allowed tolerance of the expected value. OMV has been applied to NeuroML models that have been shared on OSB, to test consistent behavior of models as the models themselves, and all supported simulators, are updated. This has proven to be a valuable process for ensuring uniform usage and interpretation of NeuroML across the ecosystem of supporting tools.

A final level of validation concerns checking whether the model elements have emergent features that are in line with experimentally observed behavior of the biological equivalents. NeuronUnit (Gerkin et al., 2019), a SciUnit (Omar et al., 2014) package for data-driven unit testing and validation of neuronal and ion channel models, is also fully NeuroML compliant, and also supports automated validation of NeuroML models shared on NeuroML-DB and OSB.

Visualizing/analyzing NeuroML models

Multiple visualization, inspection, and analysis tools are available in the NeuroML software ecosystem. Since NeuroML models have a fixed, well defined structure, NeuroML libraries can extract all information from their descriptions. This information can be used by modelers and their programs/tools to run automated programmatic analyses on models.

Box 1.

NeuroML Python tools for users

PyNeuroML provides Python functions and command line utilities.

pyNeuroML includes a range of ready-made inspection utilities for users (Box 1) that can be used via Python scripts, interactive Jupyter Notebooks, and command line tools. Examining the structure of cell and network models with 2D and 3D views is important for manual validation and to compare them to their biological counterparts. Graphical views of cell model morphology and the 3-dimensional network layout (Fig. 7), population and connectivity matrices/graphs at different levels (Fig. 8), and model summaries can all be generated (Fig. 9). In addition to these inspection functions, a variety of utilities for the inspection of NeuroML descriptions of electrophysiological properties of membrane conductances and their spatial distribution over the neuronal membrane are also provided (Fig. 9).

Visualization of detailed neuronal morphology of neurons and networks together with their functional properties (results from model simulation) enabled by NeuroML. (a) interactive 3-D (VisPy (Campagnola et al., 2023) based) visualization of an olfactory bulb network with detailed mitral and granule cells (Migliore et al., 2014), generated using pyNeuroML. (b) Visualization of an inhibition stabilized network based on Sadeh et al. (2017) using Open Source Brain (OSB) version 1 (Gleeson et al., 2019). (c) Visualization of 3D network of simplified multi-compartmental cortical neurons (from Traub et al. (2005), imported as NeuroML) and simulated spiking activity using NetPyNE’s GUI (Dura-Bernal et al., 2019), which is embedded in OSB version 2.

Analysis and visualization of network connectivity from NeuroML model descriptions *prior to simulation*. Network connectivity schematic (a) and connectivity matrix (b) for a half scale implementation of the human layer 2/3 cortical network model (Yao et al., 2022) generated using pyNeuroML.

Examples of visualizing biophysical properties of a NeuroML model neuron. (a) Electrophysiological properties generated by the NeuroML-DB web based platform (Birgiolas et al., 2023). (Plots show four superimposed voltage traces in the top panel and corresponding current injection traces below). (b) Example plots of steady states of activation (na_channel na_m inf) and inactivation (na_channel na_h inf) variables and their time courses (na_channel na_m tau and na_channel na_h tau) for the Na channel from the classic Hodgkin Huxley model (Hodgkin and Huxley, 1952). (c) The distribution of the peak conductances for the Ih channel over a layer 5 Pyramidal cell (Hay et al., 2011). Both (b) and (c) were generated using the analysis features in pyNeuroML, and similar functionality is also available in OSBv1 (Gleeson et al., 2019).

The graphical applications included in the NeuroML ecosystem (e.g. neuroConstruct, NeuroML-DB, OSB (v1 (https://v1.opensourcebrain.org) and v2), NetPyNE, and Arbor-GUI) also provide many of their own analysis and visualization functions. OSBv1, for example, supports automated 3D visualization of networks and cell morphologies, network connectivity graphs and metrics, and advanced model inspection features (Gleeson et al., 2019) (Fig. 7b). On OSBv2, NetPyNE provides advanced graphical plotting and analysis facilities (Fig. 7c). A complete JupyterLab (https://jupyter.org/) interface is also included in OSBv2 for Python scripting, allowing interactive notebooks to be created and shared, mixing scripting and graphical elements, including those generated by pyNeuroML. NeuroML-DB also provides information on electrophysiology, morphology, and the simulation aspects of neuronal models (Birgiolas et al., 2023) (Fig. 9a). In general, any NeuroML compliant application can be used to inspect and analyze elements of NeuroML models, each having their own distinct advantages.

Simulating NeuroML models

Users can simulate NeuroML models using a number of simulation engines without making any changes to their models. This is because the NeuroML/LEMS descriptions of the models are simulator independent and can be translated to simulator specific formats. pyNeuroML facilitates access to all available simulation options, both from the command line and using function calls in Python scripts when using the Python API (Box 1).

Simulation engines can be classified into five broad categories (Fig. 5):

reference NeuroML/LEMS simulators.
independently developed simulators that natively support NeuroML.
simulators that import/translate NeuroML to their own internal formats.
simulators that are supported through generation of simulator-specific scripts by the core NeuroML tools.
export to other standardized formats which may allow simulation/analysis in other packages.

Each simulation engine supports a different set of features that NeuroML users can take advantage of (Table 3). For example, the reference NeuroML and LEMS simulators, jNeuroML, jLEMS, and PyLEMS, can simulate all LEMS models and most NeuroML models. They cannot, however, simulate multi-compartmental models, and users should opt for a simulator that does, e.g. NEURON (Hines and Carnevale, 1997) or EDEN (Panagiotou et al., 2022).

Another criteria that is relevant when choosing a simulation engine is the efficiency of simulation. Simulation engines implement different computing techniques—e.g. NetPyNE, Arbor, and EDEN support parallel execution on clusters and super computers via Message Passing Interface (MPI)—to enable simulation of large scale models. Thus, for efficient large scale simulation, users may prefer one of these simulation engines.

The preferred programming language for working with NeuroML is Python (Muller et al., 2015). A Python based ecosystem ensures that automated simulation of models can easily be carried out either using scripts, or the command line tools. Utilities to enable the execution of simulations on dedicated supercomputing resources, such as the Neuroscience Gateway (NSG) (http://www.nsgportal.org/) (Sivagnanam et al., 2013) are also available within the ecosystem. OSBv1 takes advantage of these to support the submission of NeuroML model simulation jobs using the NEURON simulator on NSG. NetPyNE also includes parallel execution of simulations, batch processing and parameter exploration features, and its deployment on OSBv2 allows users to easily access these features on a scalable, cloud based platform. Finally, the JupyterLab environment on OSBv2 contains all of the core NeuroML tools and various simulation engines as pre-installed software packages, ready to use.

Optimizing NeuroML models

Development of biologically detailed models of brain function requires that components and emergent properties match the behavior of the corresponding biology as closely as possible. Thus, fitting neurons and networks to experimental data is a critical step in the model life cycle (Rossant et al., 2011; Druckmann et al., 2007). pyNeuroML promotes data-driven modeling by providing functions to fit and optimize NeuroML models against experimental data. It includes the NeuroMLTuner module (https://pyneuroml.readthedocs.io/en/development/pyneuroml.tune.html), which builds on the Neurotune package (https://github.com/NeuralEnsemble/neurotune) for tuning and optimizing NeuroML models against data using evolutionary computation techniques. This module allows users to select a set of weighted features from their data to calculate the fitness of populations of candidate models. In each generation, the fittest models are found and mutated to create the next generation of models, until a set of models that best exhibit the selected data features are isolated (see Guide 6 in Table 1) (https://docs.neuroml.org/Userdocs/OptimisingNeuroMLModels.html).

The NeuroML ecosystem includes multiple tools that also provide model fitting features. The Blue Brain Python Optimisation Library (BluePyOpt) (Van Geit et al., 2016), an extensible framework for data-driven model parameter optimization, supports exporting optimized models to NeuroML files (https://github.com/BlueBrain/BluePyOpt/blob/master/examples/neuroml/neuroml.ipynb). Similar to pyNeuroML, NetPyNE also uses the inspyred Python package (https://github.com/aarongarrett/inspyred) to provide evolutionary computation based model optimization features (Dura-Bernal et al., 2019).

Sharing NeuroML models

The NeuroML ecosystem includes the advanced web based model sharing platforms NeuroMLDB (Birgiolas et al., 2023) (https://neuroml-db.org) and OSB (Gleeson et al., 2019). These resources have been designed specifically for the dissemination of models and model elements standardized in NeuroML. The OSB platform also supports visualization, analysis, simulation, and development of NeuroML models. Researchers can create shared, collaborative NeuroML projects on it and can take advantage of the in-built automated visualization and analysis pipelines to explore and re-use models and their components. Whereas version 1 (OSBv1) focused on providing an interactive 3D interface for running pre-existing NeuroML models (e.g. sourced from linked GitHub repositories) (Gleeson et al., 2019), OSBv2 provides cloud based workspaces for researchers to construct NeuroML based computational models as well as analyze, and compare them to, the experimental data on which they are based, thus facilitating data-driven computational modeling. Appendix 1 Table 7 provides a list of stable, well tested NeuroML compliant models from brain regions including the neocortex, cerebellum and hippocampus, which have been shared on OSB.

NeuroML-DB aims to promote the uptake of standardized NeuroML models by providing a convenient location for archiving and exploration. It includes advanced database search functions, including ontology based search (Birgiolas et al., 2015), coupled with pre-computed analyses on models’ electrophysiological and morphological properties, as well as an indication of the relative speed of execution of different models.

NeuroML’s modular nature ensures that models and their components can be easily shared with others through standard code sharing resources. The simplest way of sharing NeuroML models and components is to make their Python descriptions or their XML serializations available through these resources. Indeed, it is straightforward to make Python descriptions or the XML serializations available via different file, code (GitHub, GitLab), model sharing (ModelDB (Migliore et al., 2003; McDougal et al., 2017)), and archival (Zenodo, Open Science Framework) platforms, just like any other code/data produced in scientific investigations. Complex models with many components, spanning multiple files, such as networks and neuronal models that reference multiple cell and ionic conductance definitions, can also be exported into a COMBINE zip archive (Bergmann et al., 2014), a zip file that includes metadata about its contents. pyNeuroML includes functions to easily create COMBINE archives from NeuroML models and simulations (Box 1).

OSB is designed so that researchers can share their code on their chosen platform (e.g. GitHub), while retaining full control over write access to their repositories. Afterwards, a page for the model can be created on OSB which lists the latest files present there, with links to OSB visualization/analysis/simulation features which can use the standardized files found in the resource.

NeuroML supports the embedding of structured ontological information in model descriptions (Neal et al., 2018). Models can include NeuroLex (now InterLex) (Larson and Martone, 2013) identifiers for their components (e.g. neuro_lex_id in Box 1). This links model components to their biological counterparts and makes them more transparent, findable, and reusable. For example, different types of neurons and brain regions have unique ontological ids. A user can use these ids to search for relevant model components on NeuroML-DB. More general information to maintain provenance can also be included in NeuroML models (https://docs.neuroml.org/Userdocs/Provenance.html).

Reusing NeuroML models

NeuroML models, once openly shared, become community resources that are accessible to all. Researchers can use models shared on NeuroML-DB and OSB without restrictions. Guide 5 in Table 1 provides an example of finding NeuroML based model components using the API of NeuroML-DB, and creating novel models incorporating these elements.

In addition to these platforms, other experimental data and model dissemination platforms also provide standardized NeuroML versions of relevant models to promote uptake and reuse. For example, NeuroMorpho.org (Ascoli et al., 2007) includes a tool to download NeuroML compliant versions of its cellular reconstructions^15,16. NeuroML versions of models released by organizations such as the Blue Brain Project (Markram et al., 2015) (whole cell models as well as ion channel models from Channelpedia (Ranjan et al., 2011)), the Allen Institute for Brain Science (Billeh et al., 2020), and the OpenWorm project (Gleeson et al., 2018) are also openly available for reuse (Appendix 1 Table 7).

NeuroML can also interact with other standards to further promote model re-use. Whereas NeuroML is a declarative standard, PyNN (Davison et al., 2009) is a procedural standard with a Python API for creating network models that can be simulated on multiple simulators. NeuroML models which are within the scope of PyNN can be converted to the PyNN format, and vice-versa. Similarly, NeuroML also interacts with SONATA (Dai et al., 2020) data format by supporting the two way conversion of the network structures of NeuroML models into SONATA. In standards not specific to neuroscience, models from the well established SBML standard (Hucka et al., 2003) can be converted to LEMS (Cannon et al., 2014), for inclusion in neuroscience related modeling pipelines, and a subset of NeuroML/LEMS models can be exported to SBML, which allows use with simulators and analysis packages compliant to this standard, e.g. Tellurium (Choi et al., 2018). Simulation execution details of NeuroML/LEMS models can also be exported to Simulation Experiment Description Markup Language (SED-ML) (Waltemath et al., 2011), allowing advanced resources such as Biosimulators (https://biosimulators.org) (Shaikh et al., 2022) to feature NeuroML models.

NeuroML is extensible

While the standard NeuroML elements (Appendix 1 Tables 5 and 6) provide a broad range of curated model types for simulation based investigations, NeuroML can be extended (using LEMS) to incorporate novel model elements and types when they are not (yet) available in the standard.

LEMS is a general purpose model specification language for creating fully machine readable definitions of the structure and behavior of model elements (Cannon et al., 2014). The dynamics of NeuroML elements are described in LEMS. The hierarchical nature of LEMS means that new elements can build on pre-existing elements of the modular NeuroML framework. For example, a novel ionic conductance element can extend the “ionChannelHH” element, which in turn extends “baseIonChannel”. Thus, the new element will be known to the NeuroML elements as depending on an external voltage and producing a conductance, properties that are inherited from “baseIonChannel”. Other elements, such as a cell, can incorporate this new type without needing any other information about its internal workings.

LEMS (and therefore NeuroML) element definitions (called “ComponentTypes”) specify the dynamical behavior of the model element in terms of a list of yet to be set parameters. Once the generic model behavior is defined, modelers only need to fill in the appropriate values of the required parameters (e.g. conductance density, reversal potential, etc.) to create new instances (called “Components”) of the element (see Methods for more details). Users can therefore create arbitrary, reusable model elements in LEMS, which can be treated the same way as the standard model elements (for an example see Guide 7 in Table 1).

Another major advantage of NeuroML’s use of the LEMS language is its translatability. Since LEMS is fully machine readable, its primitives (e.g. state variables and their dynamics, expressed as ordinary differential equations) can be readily mapped into other languages. As a result, simulator specific code (Blundell et al., 2018) can be generated from NeuroML models and their LEMS extensions (Fig. 5), allowing NeuroML to remain simulator-independent while supporting multiple simulation engines.

Newly created elements that may be of interest to the wider research community can be submitted to the NeuroML Editorial Board for inclusion into the standard. The standard, therefore, evolves as new model elements are added, and improved versions of the standard and associated software tool chain are regularly released to the community.

NeuroML is a global open community initiative

NeuroML is a global open community standard that is used and maintained collectively by a diverse set of stakeholders. The NeuroML Scientific Committee (https://docs.neuroml.org/NeuroMLOrg/ScientificCommittee.html) and the elected NeuroML Editorial Board (https://docs.neuroml.org/NeuroMLOrg/Board.html) oversee the standard, the core tools, and the initiative. This ensures that NeuroML supports the myriad of use cases generated by a multi-disciplinary computational modeling community.

NeuroML is an endorsed INCF (Abrams et al., 2022) community standard (Martone et al., 2019) and is one of the main standards of the international COMBINE initiative (Hucka et al., 2015), which supports the development of other standards in computational biology as well (e.g. SBML (Hucka et al., 2003) and CellML (Lloyd et al., 2004)). Participation in these organizations guarantees that NeuroML follows current best practices in standardization, and remains linked to and interoperable with other standards wherever possible. The NeuroML community also participates in training and outreach activities such as Google Summer of Code (https://docs.neuroml.org/NeuroMLOrg/OutreachTraining.html), tutorials, and internships under these and other organizations.

The NeuroML community maintains public open communication channels to ensure that all community members can easily participate in troubleshooting, discussions, and development activities. A public mailing list (https://lists.sourceforge.net/lists/listinfo/neuroml-technology) is used for asynchronous communication and announcements while open chat channels on Gitter (now Matrix/Element (https://matrix.to/#/#NeuroML_community:gitter.im)) provide immediate access to the NeuroML community. All software repositories hosted on GitHub also have issue trackers for software specific queries. A community Code of Conduct (https://docs.neuroml.org/NeuroMLOrg/CoC.html) sets the standards of communication and behavior expected on all community channels.

A crucial aim of NeuroML is to enable Open Science and ensure models in computational neuroscience are FAIR. To this end, all development and discussions related to NeuroML are done publicly. The schema, all core software tools, and relevant resources such as documentation are made freely available under suitable Free/Open Source Software (FOSS) licenses on public platforms. Everyone can, therefore, use, modify, study, and share all NeuroML artifacts without restriction. Users and developers are encouraged to contribute modifications and improvements to the schema and core tools and to participate in the general maintenance and release process.

Discussion

The model description language NeuroMLv2 has matured into a widely adopted community standard for computational neuroscience. Its modular, hierarchical structure can define a wide range of neuronal and circuit model types including simplified representations and those with a high degree of biological detail. The standardized, machine readable format of the NeuroMLv2/LEMS framework provides a flexible, common language for communicating between a wide range of tools and simulators used to create, validate, visualize, analyze, simulate, share and reuse models. By enabling this interoperability, NeuroMLv2 has spawned a large ecosystem of interacting tools that cover all stages of the model development life cycle, bringing greater coherence to a previously fragmented landscape. Moreover, the modular nature of the model components and hierarchical structure conferred by NeuroMLv2, combined with the flexibility of coding in Python, has created a powerful “building block” approach for constructing standardized models from scratch. NeuroML has therefore evolved from a standardized archiving format into a mature language that supports an ecosystem of tools for the creation and execution of models that support the FAIR principles and promote open, transparent and reproducible science.

Evolution of NeuroML and emergence of the NeuroMLv2 tool ecosystem

NeuroML was conceived (Goddard et al., 2001) and developed (Gleeson et al., 2010) as a declarative XML based framework for defining biophysical models of neurons and networks in a standardized form in order to compare model properties across simulators and to promote transparency and reuse. NeuroML version 1 achieved these aims and was mainly used to archive and visualize existing models (Gleeson et al., 2010). Building on this, the subsequent development of the NeuroMLv2/LEMS framework provided a way to describe models as a hierarchical set of components with dimensional parameters and state variables, so that their structure and dynamics are fully machine readable (Cannon et al., 2014). This enabled models to be losslessly mapped to other representations, greatly promoting interoperability between tools through read-write and automated code generation (Blundell et al., 2018). As NeuroMLv2 matured and became a community standard recognized by the INCF with a formal governance structure, an increasingly wide range of models and modeling tools have been developed or modified to be NeuroMLv2 compliant (Appendix 1 Tables 1, 2 and 7). The core tools, maintained directly by the NeuroML developers (Fig. 4), provide functionality to read, modify, or create new NeuroML models, as well as to analyze and visualize, and simulate the models. Furthermore, there are now a larger number of tools that have been developed by other members of the community (Fig. 3), including a neuronal simulator designed specifically for NeuroMLv2 (Panagiotou et al., 2022). The emergence of an ecosystem of NeuroMLv2 compliant tools enables modelers to build tool chains that span the model life cycle and build and reuse standardized models.

NeuroML and other standards in computational neuroscience

Several other standards and formats exist to support computational modeling of neuronal systems. Whereas NeuroML is a modular, declarative simulator independent standard for biophysical neuronal modelling, PyNN (Davison et al., 2009) and SONATA (Dai et al., 2020) provide a procedural Python based simulator independent API and a framework for efficiently handling large scale network simulations, respectively. Even though there is some overlap in the functionality provided by these standards, they each target distinct use cases and have their own goals and features. The teams developing these standards work in concert to ensure that they remain interoperable with each other, frequently sharing methods and techniques (Dai et al., 2020). This allows researchers to use their standard of choice and easily combine with another if the need arises. PyNN and SONATA are therefore integral parts of the wider NeuroML ecosystem.

Why using NeuroML and Python promotes the construction of FAIR models

The modular and hierarchical structure of NeuroMLv2, when combined with Python, provides a powerful combination of structured declarative elements and flexible procedural approaches that enables a “Lego-like” building block approach for constructing biologically detailed models (CaycoGajic et al., 2017; Billings et al., 2014; Kriener et al., 2022; Gurnani and Silver, 2021). This has been advanced by the development of pyNeuroML, which provides a single installable package which offers direct access to a range of functionality for handling NeuroML models (Box 1). Moreover, the web-based documentation of NeuroMLv2, with multiple Python scripts illustrating the usage of the language and associated tools (Table 1), has recently been updated and expanded (https://docs.neuroml.org). This provides a central resource for both new and experienced users of NeuroML supporting its use in model building. As the examples of this resource illustrate, building models using NeuroMLv2 is efficient and intuitive, as the model components are pre-made and how they fit together specified. The structured format allows APIs like libNeuroML to incorporate features such as auto-completion and inline validation of model parameters and structure as scripts are being developed. In addition, automated multi-stage model validation ensures the code, equations and internal structure are validated against the NeuroML schema minimizing human errors and model simulations outputs are within acceptable bounds (Fig. 6). The NeuroMLv2 ecosystem also provides convenient ways to visualize and inspect the inner structure of models.

pyNeuroML provides Python functions and corresponding command line utilities to view neuronal morphology (Fig. 7), neuronal electrophysiology (Fig. 9), circuit connectivity and schematics (Fig. 8). In addition, custom analysis pipelines and advanced neuroinformatics resources can easily be built using the APIs. For example, loading a NeuroML model of a neuron into OSB enables visualization of the morphology and the spatial distribution of ionic conductance over the membrane as well as inspection of the conductance state variables, while the connectivity and synaptic weight matrices can be automatically displayed for circuit models (Fig. 7, Gleeson et al. (2019)). Such features of OSB, which are made possible by the structured format of NeuroMLv2, promote model transparency, reproducibility and sharing. By enabling the development and sharing of well tested and transparent models the wider NeuroMLv2 ecosystem promotes Open Science.

Limitations of NeuroML and current work

A limitation of any standardized framework is that there will always be models and model elements that fall outside the current scope of the standard. Although NeuroML suffers from this limitation, the underlying LEMS based framework provides a flexible route to develop a wide range of new types of physio-chemical models (Cannon et al., 2014). This is relatively straightforward if the new model component, such as a synaptic plasticity mechanism, fits within the existing hierarchical structure of NeuroMLv2, as the new type of synaptic element can build on an existing base synapse type, which specifies the relevant input and outputs (e.g. local voltage and synaptic current). For more radical shifts in model types (e.g. neuronal morphologies that grow during learning), that do not fit neatly into the current NeuroMLv2 schema, structural changes to the language would be required. This route is more involved as the pros and cons of changes to the structure of NeuroMLv2 would need to be considered by the Scientific Committee and, if approved, implemented by the Editorial Board.

Whereas the current scope of NeuroMLv2 encompasses models of spiking neurons and networks at different levels of biological detail, plans are in place to extend its scope to include more abstract, rate-based models of neuronal populations (e.g. see Wilson and Cowan (1972); Mejias et al. (2016) in Appendix 1 Table 7). Additionally, work is under way to extend current support for SBML (Hucka et al., 2003) based descriptions of chemical signaling pathways (Cannon et al., 2014), to enable better biochemical descriptions of sub-cellular activity in neurons and synapses.

There is a growing interest in the field for the efficient generation and serialization of large scale network models, containing numbers of neurons closer to their biological equivalents (Markram et al., 2015; Billeh et al., 2020; Einevoll et al., 2019). While a multitude of applications in the NeuroML ecosystem support large scale model generation (e.g. NetPyNE, neuroConstruct, PyNN), the default serialization of NeuroML (XML) is inefficient for reading/writing/storing such extensive descriptions. NeuroML does have an internal format for serializing in the binary format HDF5 (see Methods), but has also recently added support for export of models to the SONATA data format (Dai et al., 2020), allowing efficient serialization of large scale models. Even though individual instances of large scale models are useful, the ability to generate families of these for multiple simulation runs, and more particularly a way to encapsulate, examine and reuse templates for network models, is also required. A prototype package, NeuroMLlite (https://github.com/NeuroML/NeuroMLlite), has been developed which allows these concise network templates to be described and multiple instances of networks to be generated, and facilitates interaction with simulation platforms and efficient serialization formats.

As discoveries and insights in neuroscience inform machine learning and visa versa, there is an increasing need to develop a common framework for describing both biological and artificial neural networks. Model Description Format (MDF) has been developed to address this (Gleeson et al., 2023). This initiative has developed a standardized format, along with a Python API, which allows the specification of artificial neural networks (e.g. Convolutional Neural Networks, Recurrent Neural Networks) and biological neurons using the same underlying entities. Support for mapping MDF to/from NeuroMLv2/LEMS has been included from the start. This work will enable deeper integration of computational neuroscience and “brain-inspired” networks in Artificial Intelligence (AI).

Conclusion and vision for the future

NeuroMLv2 is already a mature community standard that provides a framework for standardizing biologically detailed neuronal network models. By providing a stable, common framework defining the essential entities required for biologically detailed neuronal modeling, NeuroML has spawned an ecosystem of tools that span all stages of the model development life cycle. In the short term, we envision the functionality of NeuroML to expand further and for new online resources that encourage the construction of FAIR models using pyNeuroML to be taken up by the community. The NeuroML development team are also beginning to explore how to combine NeuroML-based circuit models with musculo-skeletal simulations to enable models of the neural control of behavior. In the longer term, developing seamless interfaces between NeuroML and other domain specific standards will enable the development of more holistic models of the neural control of body systems across a wide range of organisms, as well as greater exchange of models and insights between computational neuroscience and AI.

Methods

NeuroMLv2 is formally specified by the NeuroMLv2 XML schema, which defines the allowed structure of XML files which comply to the standard, and the LEMS ComponentType definitions, which define the internal state variables of the underlying elements, providing a machine-readable specification of the time evolution of model components. The specification is backed up by a suite of software tools that support the model life cycle, and the accompanying usage and development documentation. We illustrate the key parts of this framework using the HindmarshRose cell model (Hindmarsh et al. (1984), Fig. 10), which, as an abstract point neuron model, serves as an appropriate simple NeuroMLv2 ComponentType.

Example model description of a HindmarshRose1984Cell NeuroML component. (a) XML serialization of the model description containing the main hindmarshRose1984Cell element, with a set of parameters which result in regular bursting. A current clamp stimulus is applied using a pulseGenerator, and a population of one cell is added with this in a network. This XML can be validated against the NeuroML Schema. (b) Membrane potentials generated from a simulation of the model in (a). The LEMS simulation file to execute this is shown in Fig. 14.

The NeuroML XML Schema

We begin with the NeuroMLv2 standard. The standard consists of two parts, each serving different functions:

the NeuroMLv2 XML schema
corresponding LEMS component type definitions

The NeuroMLv2 schema is a language independent data model that constrains the structure of a NeuroMLv2 model description. The NeuroML schema is formally described as an XML Schema document (https://neuroml.org/schema/neuroml2) in the XML Schema Definition (XSD) format, a recommendation of the World Wide Web Consortium (W3C) (https://www.w3.org/TR/xmlschema-1/). An XML document that claims to conform to a particular schema can be validated against the schema. All NeuroMLv2 model descriptions can therefore, be validated against the NeuroMLv2 schema.

The basic building blocks of an XSD schema are “simple” or “complex” types and their “attributes”. All types are created as “extensions” or “restrictions” of other types. Complex types may contain other types and attributes whereas simple types may not. Fig. 11 shows some example types defined in the NeuroMLv2 schema. For example, the Nml2Quantity_none simple type restricts the in-built “string” type using a regular expression “pattern” that limits what string values it can contain. The type is Nml2Quantity_none is to be used for unit-less quantities (e.g. 3, 6.7, −1.1e-5) and the restriction pattern for translates to “a string that may start with a hyphen (negative sign), followed by any number of numerical characters (potentially containing a decimal point) and a string containing capital or small ‘e’ (to specify the exponent)”. The restriction pattern for the Nml2Quantity_voltage type is similar, but must be followed by a “V” or “mV”. In this way, the restriction ensures that a value of type “Nml2Quantity_voltage” represents a physical voltage quantity with units “V” (volt) or “mV” (millivolt). Furthermore, a NeuroMLv2 model description that uses a voltage value that does not match this pattern, for example “0.5 s”, will be invalid.

Type definitions taken from the NeuroMLv2 schema, which describes the structure of NeuroMLv2 elements. Top: “simple” types may not include other elements or attributes. Here, the Nml2Quantity_none and Nml2Quantity_voltage types define restrictions on the default string type to limit what strings can be used as valid values for attributes of these types. Bottom: example of a “complex” type, the HindmarshRose cell model (Hindmarsh et al., 1984), that can also include other elements of other types, and extend other types.

The example of a complex type in Fig. 11 is the HindmarshRose1984Cell type that extends the BaseCellMembPotCap complex type (the base type for any cell producing a membrane potential v with a capacitance parameter C), and defines new “required” (compulsory) attributes. These attributes are of simple types—these are all unit-less quantities apart from v_scaling, which has dimension voltage. Note that inherited attributes are not re-listed in the complex type definition— the compulsory capacitance attribute, C, is inherited here from BaseCellMembPotCap.

The NeuroMLv2 schema serves multiple critical functions. A variety of tools and libraries support the validation of files against XSD schema definitions. Therefore, the NeuroMLv2 schema enables the validation of model descriptions—model structure, parameters, parameter values and their units, cardinality, element positioning in the model hierarchy (level 1 validation in Fig. 6)— prior to simulation. XSD schema definitions, as language independent data models, also allow the generation of APIs in different languages. More information on how APIs in different languages are generated using the NeuroMLv2 XSD schema definition is provided in later sections. The NeuroMLv2 XSD schema is also released and maintained as a versioned artifact, similar to the software packages. The current version is 2.3, and can be found in the NeuroML2 repository on GitHub (https://github.com/NeuroML/NeuroML2/tree/master/Schemas/NeuroML2).

LEMS ComponentType definitions

The second part of the NeuroMLv2 standard consists of the corresponding LEMS ComponentType definitions. Whereas the XSD Schema describes the structure of a NeuroMLv2 model description, the LEMS ComponentType definitions formally describe the dynamics of the model elements.

LEMS (Cannon et al., 2014) is a domain independent general purpose machine-readable language for describing models and their simulations. A complete description of LEMS is provided in Cannon et al. (2014) and in our documentation (https://docs.neuroml.org/Userdocs/LEMSSchema.html). Here, we limit ourselves to a short summary necessary for understanding the NeuroMLv2 ComponentType definitions.

LEMS allows the definition of new model types called ComponentTypes. These are formal descriptions of how a generic model element of that type behaves (the “dynamics”), independent of the specific set of parameters in any instance. To describe the dynamics, such descriptions must list any necessary parameters that are required, as well as the time-varying state variables. The dimensions of these parameters and state variables must be specified, and any expressions involving them must be dimensionally consistent. An instance of such a generic model is termed a Component, and can be instantiated from a ComponentType by providing the necessary parameters. One can think of ComponentTypes as user defined data types similar to “classes” in many programming languages, and Components as “objects” of these types with particular sets of parameters. Types in LEMS can also extend other types, enabling the construction of a hierarchical library of types. In addition, since LEMS is designed for model simulation, ComponentType definitions also include other simulation related features such as Exposures, specifying quantities that may be accessed/recorded by users.

For model elements included in the NeuroML standard, there is a one-to-one mapping between types specified in the NeuroML XSD schema, and LEMS ComponentTypes, with the same parameters specified in each. The addition of new model elements to the NeuroML standard, therefore, requires the addition of new type definitions to both the XSD schema and the LEMS definitions. New user defined ComponentTypes, nevertheless, can be defined in LEMS and used freely in models, and these do not need to be added to the standard before use. The only limitation here is that new user defined ComponentTypes cannot be validated against the NeuroML schema, since their type definitions will not be included there.

Fig. 12 shows the ComponentType definition for the HindmarshRose1984Cell model element. Here, the HindmarshRose1984Cell ComponentType extends baseCellMembPotCap and inherits its elements. The ComponentType includes parameters that users must provide when creating a new instance (component): a, b, c, d, r, s, x1, v_scaling.

LEMS ComponentType definition of the HindmarshRose cell model (Hindmarsh et al., 1984).

Other parameters, x0, y0, and z0 are used to initialize the three state variables of the model, x, y, z. x is the proxy for the membrane potential of the cell used in the original formulation of the model (Hindmarsh et al., 1984) and is here scaled by a factor v_scaled to expose a more physiological value for the membrane potential of the cell in StateVariable v. A Constant MSEC is defined to hold the value of 1 ms for use in the ComponentType. Next, an Attachment enables the addition of entities that would provide external inputs to the ComponentType. Here, synapses are Attachments of the type basePointCurrent and provide synaptic current input to this ComponentType.

The Dynamics block lists the mathematical formalism required to simulate the ComponentType. By default, variables defined in the Dynamics block are private, i.e., they are not visible outside the ComponentType. To make these visible to other ComponentTypes and to allow users to record them, they must be connected to Exposures. Exposures for this ComponentType include the three state variables, but also the internal derived variables, which while not used by other components, are useful in inspecting the ComponentType and its dynamics. An extra exposure, stateVariable v, is added to allow other NeuroML components access to the spiking state of the cell that will be determined in the Dynamics block.

StateVariable definitions are followed by DerivedVariables, variables whose values depend on other variables but are not time derivatives (which are handled separately in TimeDerivative blocks (below)). The total synaptic current, iSyn, is a summation of all the synaptic currents, i received by the synapses that may be attached on to this ComponentType. The synapse[*]/i value of the select field tells LEMS to collect all the i exposures from any synapses Attachments, and the add value of the reduce field tells LEMS to sum the multiple values. As noted, x is a scaled version of the membrane potential variable, v. This is followed by the three derived variables, phi, chi, rho where:

The total membrane potential of the cell, iMemb is calculated as the sum of the capacitive current and the synaptic current:

v, y, z are TimeDerivatives, with the “value” representing the rate of change of each variable:

The final few blocks set the initial state of the component (OnStart),

and define conditional expressions to set the spiking state of the cell:

Both the XSD schema and the LEMS ComponentType definitions enable model validation. However, despite some overlap, they support different types of validation. Whereas the XSD schema allows for the validation of model descriptions (e.g. the XML files), the LEMS ComponentType definitions enable validation of model instances, i.e., the “runnable” instances of models that are constructed once components have been created by instantiating ComponentTypes with the necessary parameters, and various attachments created between source and target components. A model description may be used to create many different model instances for simulation. Indeed, it is common practice to run models that include stochasticity with different seeds for random number generators to verify the robustness of simulation results. Thus, the validation of dimensions and units that LEMS carries out is done only after a runnable instance of a model has been created. The LEMS ComponentType definitions for NeuroMLv2 are also maintained as versioned files that are updated along with the XSD schema. These can also be seen in the NeuroMLv2 GitHub repository (https://github.com/NeuroML/NeuroML2/tree/master/NeuroML2CoreTypes). An index of the ComponentTypes included in version 2.3 of the NeuroML standard, with links to online documentation, is also provided in Appendix 1 Tables 5 and 6.

NeuroML APIs

The NeuroMLv2 software stack relies on the NeuroML APIs that provide functionality to read, write, validate, and inspect NeuroML models. The APIs are programmatically generated from the machine readable XSD schema, thus ensuring that the class for defining a specific NeuroML element in a given language (e.g. Java) has the correct set of fields with the appropriate type (e.g. float or string) corresponding to the allowed parameters in the corresponding NeuroML element. NeuroMLv2 currently provides APIs in numerous languages—Python (libNeuroML which is generated via generateDS (http://www.davekuhlman.org/generateDS.html)), Java (org.neuroml.model via JAXB XJC (https://javaee.github.io/jaxb-v2/)), C++ (NeuroML_CPP via XSD (https://www.codesynthesis.com/products/xsd/)) and MATLAB (NeuroMLToolbox which accesses the Java API from MATLAB), and APIs for other languages can also be easily generated as required. LEMS is also supported by a similar set of APIs—PyLEMS in Python, and jLEMS in Java—and since a NeuroMLv2 model description is a set of LEMS Components, the LEMS APIs also support them (e.g. the hindmarshRose1984Cell example in Fig. 10 could be loaded by jLEMS and treated as a LEMS Component).

Fig. 13 shows the use of the NeuroML Python API to describe a model with one HindmarshRose cell. In Python, the instances of ComponentTypes, their Components, are represented as Python objects. The hr0 Python variable stores the created HindmarshRose1984Cell component/object. This is added to a Population, pop0 in the Network net. The network also includes a PulseGenerator with amplitude 5 nA as an ExplicitInput that is targeted at the cell in the population. The model description is serialized to XML (Fig. 10) and validated. Note that, as the standard convention for classes in Python is to use capitalized names, HindmarshRose1984Cell is used in Python, which is serialized as <hindmarshRose1984Cell> in the XML. Users can either share the Python script itself, or the XML serialization. Any valid XML serialization can be also loaded into a Python object model and modified.

Example model description of a HindmarshRose1984Cell NeuroML component in Python, using parameters for regular bursting. This script generates the XML in Fig. 10.

XML is the default serialization of NeuroML, and all existing APIs can read and write the format (and it should be seen as a minimal requirement for new APIs to support XML). There is however an alternative HDF5 (https://www.hdfgroup.org/solutions/hdf5) based serialization of NeuroML files which is supported by both libNeuroML and the Java API, org.neuroml.model (https://docs.neuroml.org/Userdocs/HDF5.html). This format is based on an efficient representation of cell positions and connectivity data as HDF5 data sets, which can be serialized in compact binary format, and loaded into memory for optimized access (e.g. as numpy arrays in libNeuroML). This reduces the size of the saved files for large scale networks, and speeds up loading/writing models, eliminating the need to parse/generate large text files containing XML. Models serialized in this format can be loaded and transformed to simulator code in the same way as XML based models by the Java and Python APIs.

Simulating NeuroML models

The model description shown in Fig. 10 contains no information about how it is to be simulated, or on the dynamics of each model component. Providing this simulation information and linking in the ComponentType definition requires creating a LEMS file to fully specify the simulation. Fig. 14 shows the use of utilities included in the Python pyNeuroML package to describe a LEMS simulation As noted previously, NeuroML/LEMS model and simulation descriptions are machine readable and simulator independent and can be simulated by simulation engines using a multitude of strategies (Fig. 5).

An example simulation of the HindmarshRose model description shown in Fig. 13, with the LEMS serialization shown at the bottom.

The first category of tools consists of the reference NeuroML and LEMS simulation engines. These work directly with NeuroML and LEMS as their base descriptions of modeling entities and do not have their own specific formats. They are maintained by the NeuroML Editorial Board— jLEMS, jNeuroML, and PyLEMS (Fig. 4). jLEMS serves as the reference implementation for the LEMS language, and as such it can simulate any model described in LEMS (not necessarily from neuroscience). When coupled with the LEMS definitions of NeuroML standard entity structure/dynamics, it can simulate most NeuroML models, though it does not currently support multi-compartmental neurons. jNeuroML bundles the NeuroML standard LEMS definitions, jLEMS and other functionality into a single package for ease of installation/usage. There is also a pure Python implementation of a LEMS interpreter, PyLEMS, which can be used in a similar way to jLEMS. The pyNeuroML package encapsulates all of these tools to give easy access (at both command line and in Python) to all of their functionality (Box 1).

The second category consists of other simulators which support NeuroML natively. The EDEN simulator is an independently developed tool that was designed from its inception to read NeuroML and LEMS models for efficient, parallel simulation (Panagiotou et al., 2022).

The third category involves simulators which have their own internal formats, and include methods to translate NeuroMLv2/LEMS models to their own formats. Examples include NetPyNE (DuraBernal et al., 2019), MOOSE (Ray and Bhalla, 2008), and N2A (Rothganger et al., 2014).

The fourth category comprises tools for which the NeuroML tools generate simulator specific scripts. The simulation engines then execute these scripts, similar to how they would execute handwritten user scripts. These include NEURON (Hines and Carnevale, 1997) for which the NeuroML tools generate scripts in Python and the simulator’s hoc and NMODL formats and the Brian simulator (Stimberg et al., 2019) which uses Python scripts.

The final category consists of export options to standardized formats in neuroscience and the wider computational biology field, which enable interaction with simulators and applications supporting those formats. These include the PyNN package (Davison et al., 2009), which can be run in either NEURON, NEST (Gewaltig and Diesmann, 2007) or Brian, the SONATA data format (Dai et al., 2020) and the SBML standard (Hucka et al., 2003) (see Reusing NeuroML models for more details).

Having multiple strategies in place for supporting NeuroML gives more freedom to simulator developers to choose how much they wish to be involved with implementing and supporting NeuroML functionality in their applications, while maximizing the options available for end users.

The primary tool for simulating NeuroML/LEMS models via different engines is jNeuroML, which is included in pyNeuroML. jNeuroML supports all simulator engine categories (Fig. 5). It includes jLEMS for simulation of LEMS and single compartmental NeuroML models. It can also pass simulations to the EDEN simulator (Panagiotou et al., 2022) for direct simulation. Using the org.neuroml.export library (https://github.com/NeuroML/org.neuroml.export), jNeuroML can also generate import scripts for simulators (e.g. NetPyNE (Dura-Bernal et al., 2019)) or convert NeuroML/LEMS models to simulator specific formats (e.g. NEURON (Hines and Carnevale, 1997)). Supporting a new simulation engine that requires translation of NeuroML/LEMS into another format can be done by adding a new “writer” to the org.neuroml.export library. Finally, jNeuroML also includes the org.neuroml.import (https://github.com/NeuroML/jNeuroML) library that converts from other formats (e.g. SBML (Hucka et al., 2003)) to LEMS for combination with NeuroML models.

It is important to note though that not all NeuroML models can be exported to/are supported by each of these target simulators (Table 3). This depends on the capabilities of the simulator in question (whether it supports networks, or morphologically detailed cells) and pyNeuroML/jNeuroML will provide feedback if a feature of the model is not supported in a chosen environment.

All NeuroML and LEMS software packages are made available under FOSS licenses. The source code for all NeuroML packages and the standard can be obtained from the NeuroML GitHub organization (https://github.com/NeuroML). The NeuroML Python API (https://github.com/NeuralEnsemble/libNeuroML) was developed in collaboration with the NeuralEnsemble initiative (https://github.com/NeuralEnsemble/), which also maintains other commonly used Python packages such as PyNN (Davison et al., 2009), Neo (Garcia et al., 2014) and Elephant (Denker et al., 2018). LEMS packages are available from the LEMS GitHub organization (https://github.com/LEMS).

To ensure replication and reproduction of studies, it is important to note the exact versions of software used in studies. For NeuroML and LEMS packages, archives of each release along with citations are published on Zenodo (https://zenodo.org) to enable researchers to cite them in their work.

Documentation

A standard and its accompanying software ecosystem must be supported by comprehensive documentation if it is to be of use to the research community. The primary NeuroML documentation for users that accompanies this paper has been consolidated into a JupyterBook (Executable Books Community, 2020) at https://docs.neuroml.org. This includes explanations of NeuroML and computational modeling concepts, interactive tutorials with varying levels of complexity, information about tools and what functions they provide to support different stages of the model life cycle. The JupyterBook framework supports “executable” documentation through the inclusion of interactive Jupyter notebooks which may be run in the users’ web browser on free services such as OSBv2, Binder.org (https://mybinder.org/) (Project Jupyter et al., 2018) and Google Colab (https://colab.research.google.com/). Finally, the machine readable nature of the schema and LEMS also enables the automated generation of human readable documentation for the standard and low level APIs (Fig. 15) along with their examples (https://docs.neuroml.org/Userdocs/Schemas/Cells.html#hindmarshrose1984cell). In addition, the individual NeuroML software packages each have their own individual documentation (e.g. pyNeuroML (https://pyneuroml.readthedocs.io/en/stable/), libNeuroML (https://libneuroml.readthedocs.io/en/stable/)).

As with the rest of the NeuroML ecosystem, the documentation is hosted on GitHub (https://github.com/NeuroML/Documentation), licensed under a FOSS license, and community contributions to it are welcomed. A PDF version of the documentation can also be downloaded for offline use (https://docs.neuroml.org/_static/files/neuroml-documentation.pdf).

Maintenance of the Schema and core software

The NeuroML Scientific Committee (https://docs.neuroml.org/NeuroMLOrg/ScientificCommittee.html) and the elected NeuroML Editorial Board (https://docs.neuroml.org/NeuroMLOrg/Board.html) oversee the standard, the core tools, and the initiative. The Scientific Committee sets the scientific focus of the NeuroML initiative. It ensures that the standard represents the state of the art—that it can encapsulate the latest knowledge in neuronal anatomy and physiology in their corresponding model components. The Scientific Committee also defines the governance structure of the initiative, and works with the wider scientific community to gather feedback on NeuroML and promote its use. The Editorial Board manages the day to day development and maintenance of LEMS, the NeuroML schema, the core software tools, and critical resources such as the documentation. The Editorial Board works with simulator developers in the extended ecosystem to help make tools NeuroML compliant by testing reference implementations and answering technical queries about NeuroML and the core software tools.

Acknowledgements

We thank all the members of the NeuroML Community who have contributed to the development of the standard over the years, have added support for the language to their applications, or who have converted published models to NeuroML. We would particularly like to thank the following for contributions to the NeuroML Scientific Committee: Upi Bhalla, Avrama Blackwell, Hugo Cornells, Robert McDougal, Lyle Graham, Cengiz Gunay and Michael Hines. The following have also contributed to developments related to the named tools/simulators/resources: EDEN Mario Negrello and Christos Strydis, SONATA Anton Arkhipov and Kael Dai, MOOSE Subhasis Ray, NeuroMLDB Justas Birgiolas, NeuroMorpho.Org Giorgio Ascoli, N2A Fred Rothganger, pyLEMS Gautham Ganapathy, MDF Manifest Chakalov, libNeuroML and NeuroTune Mike Vella, Open Source Brain Matt Earnshaw, Adrian Quintana and Eugenio Piasini, SciUnit/NeuronUnit Richard C Gerkin, Brian Marcel Stimberg and Dominik Krzemiński, Arbor Nora Abi Akar, Thorsten Hater and Brent Huisman, BluePyOpt Jaquier Aurélien Tristan and Werner van Geit, C++/MATLAB APIs Jonathan Cooper. We thank Rokas Stanislavos, András Ecker, Jessica Dafflon, Ronaldo Nunes, Anuja Negi and Shayan Shafquat for their work converting models to NeuroML format as part of the Google Summer of Code program. We also thank Diccon Coyle for feedback on the manuscript.

Significance of findings

Strength of evidence

Abstract

Introduction

Results

NeuroML provides a ready to use set of curated model elements

NeuroML is a modular, structured language for defining FAIR models

NeuroML supports a large ecosystem of software tools that cover all stages of the model life cycle

Creating NeuroML models

Validating NeuroML models

Visualizing/analyzing NeuroML models

Box 1.

NeuroML Python tools for users

Simulating NeuroML models

Optimizing NeuroML models

Sharing NeuroML models

Reusing NeuroML models

NeuroML is extensible

NeuroML is a global open community initiative

Discussion

Evolution of NeuroML and emergence of the NeuroMLv2 tool ecosystem

NeuroML and other standards in computational neuroscience

Why using NeuroML and Python promotes the construction of FAIR models

Limitations of NeuroML and current work

Conclusion and vision for the future

Methods

The NeuroML XML Schema

LEMS ComponentType definitions

NeuroML APIs

Simulating NeuroML models

Documentation

Maintenance of the Schema and core software

Acknowledgements

Funding

NeuroML software core tools and libraries, with a description of their scope, the main programming language they use (or other interaction means, e.g. Command Line Interface (CLI)), and links for more information.

Tools in the wider NeuroML software ecosystem, with a description of their scope, the main programming language they use (or other interaction means, e.g. through a web browser, Graphical User Interface (GUI) or Command Line Interface (CLI)), and links for more information.

Listing of validation tests run by NeuroML

Index of standard NeuroMLv2 ComponentTypes

Index of standard NeuroMLv2 ComponentTypes (continued)

Listing of NeuroML models and example repositories

References

Article and author information

Author information

Ankur Sinha†

Padraig Gleeson†

Bóris Marin

Salvador Dura-Bernal

Sotirios Panagiotou

Sharon Crook

Matteo Cantarelli

Robert C Cannon

Andrew P Davison

Harsha Gurnani

R Angus Silver

Version history

Cite all versions

Copyright

Metrics

Ankur Sinha

Padraig Gleeson