Illuminating the Dark Parameter Space of Neuroscience Modeling
To understand how neural circuits create patterns of activity and behavior, computational neuroscientists build mathematical models. These computer simulations are meant to replicate the real circuits found in biology, with different parameters of the model representing different physiological properties of neurons, such as the strength of connections between neurons or the resting potential of a cell. By investigating these models, scientists can perform in silico experiments that would not otherwise be possible, generate new hypotheses, and gain insights into how real neural networks work.
Two preprints posted to bioRxiv simultaneously in November 2019 are helping neuroscientists build and understand these models more efficiently. One comes from Jakob Macke, a theoretical neuroscientist at the Technical University of Munich, and collaborators; the other comes from John Cunningham, a statistician and neuroscientist at Columbia University and an investigator with the Simons Collaboration on the Global Brain (SCGB), and collaborators. In addition, Cunningham’s team has developed a new online platform, presented at the Cosyne meeting in Denver in February, that is making methods like these easier to use and more available to everyone.
One of the biggest challenges in building mathematical circuit models is in choosing the exact values each parameter should take to make the model circuit function like the real one. Some of these parameter values can be gleaned from experiments. But in many cases, there are just too many unknowns to build a circuit model from the ground up. To remedy this, scientists have turned to new parameter-fitting techniques that make the process faster and more comprehensive.
The most basic way to explore parameter space is simple trial and error: produce many different combinations of parameter values, try them out, and see if they make the model behave like the biology. This method has been used successfully in many studies. One influential example is work from the lab of Eve Marder, a neuroscientist at Brandeis University, that explores the stomatogastric ganglion (STG), a small circuit in the guts of crustaceans that produces the rhythms needed for digestion. In an initial study, first author Mark Goldman, now at the University of California, Davis and an SCGB PI, varied five different parameters in a model of an individual neuron and measured how the response properties of the neuron, such as whether it exhibited bursting or not, changed as a result. This work revealed that several different sets of parameter values could produce the same behavior in the neuron. Experiments on STG neurons verified that this phenomenon exists in the biology as well as in the model.
In a subsequent study, this concept was extended to the STG circuit as a whole. First author Astrid Prinz generated over 20 million parameter sets, each defined by a set of values representing membrane conductances and coupling strengths between the different neurons in the circuit. Prinz and her colleagues found that select sets of parameters with very different values could produce the right rhythms. Together these results may explain how animals with different circuit properties can produce the same behavior.
The trial-and-error approach to finding parameters is time-consuming, however, and because it only provides an answer for the exact parameters tested, it may lead to a limited view of the model’s behavior.
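To make the trial-and-error idea concrete, here is a minimal sketch of a parameter sweep. It does not use the STG model; it assumes a much simpler leaky integrate-and-fire neuron, and every name and number in it (the leak conductance grid, the input current grid, the 45 to 65 Hz target range) is a hypothetical choice made for illustration.

```python
from itertools import product


def firing_rate_hz(g_leak, current, duration_ms=1000.0, dt=0.1):
    """Simulate a leaky integrate-and-fire neuron and return its firing rate.

    Assumed units: g_leak in microsiemens, current in nanoamps,
    voltages in millivolts, capacitance in nanofarads, time in milliseconds.
    """
    capacitance = 1.0
    v_rest, v_threshold, v_reset = -65.0, -50.0, -65.0
    v = v_rest
    spikes = 0
    for _ in range(round(duration_ms / dt)):
        # Forward-Euler step of C dV/dt = -g_leak * (V - v_rest) + current.
        v += dt * (-g_leak * (v - v_rest) + current) / capacitance
        if v >= v_threshold:
            spikes += 1
            v = v_reset
    return spikes * 1000.0 / duration_ms


# Hypothetical grid of parameter values to try.
leak_values = [0.03, 0.05, 0.07, 0.09]
current_values = [0.5, 1.0, 1.5, 2.0]

# Hypothetical target behavior: a firing rate between 45 and 65 Hz.
target_low, target_high = 45.0, 65.0

matches = []
for g_leak, current in product(leak_values, current_values):
    rate = firing_rate_hz(g_leak, current)
    if target_low <= rate <= target_high:
        matches.append((g_leak, current, rate))

for g_leak, current, rate in matches:
    print(f"g_leak={g_leak} uS, current={current} nA -> {rate:.0f} Hz")
```

Even in this toy case, more than one combination of leak conductance and input current lands in the target range, and the sweep says nothing about the values that fall between grid points.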
Statisticians prefer to take a more rigorous approach when fitting models. Specifically, they aim to calculate the posterior distribution, which describes the probability that a given set of parameters is correct, given the data the researchers are trying to replicate. The posterior gives a full picture of the model’s behavior, describing which parameter regimes will be good or bad without the need to run a simulation of the model for each one. It also lets modelers describe the parameter fit in terms of uncertainty — in shades of gray, rather than only black or white.
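For a model simple enough, the posterior can be computed directly from Bayes’ rule: the posterior probability of a parameter value is proportional to the likelihood of the data under that value, multiplied by the prior. The sketch below assumes a hypothetical one-parameter model, a neuron whose spike counts follow a Poisson distribution with an unknown rate, along with made-up spike counts and a flat prior over a grid of candidate rates.

```python
import math

# Hypothetical data: spike counts observed in five equal time windows.
spike_counts = [4, 6, 5, 7, 3]

# Candidate values for the unknown rate (spikes per window), 0.1 to 20.0.
rates = [i / 10 for i in range(1, 201)]


def log_likelihood(rate):
    """Poisson log-likelihood of the counts, up to a rate-independent constant."""
    return sum(k * math.log(rate) - rate for k in spike_counts)


# With a flat prior, the log-posterior equals the log-likelihood plus a constant.
log_post = [log_likelihood(r) for r in rates]
peak = max(log_post)
weights = [math.exp(lp - peak) for lp in log_post]
total = sum(weights)
posterior = [w / total for w in weights]

# The most probable rate, and a 95 percent credible interval read off the
# cumulative posterior.
map_rate = rates[posterior.index(max(posterior))]
lower = upper = None
cumulative = 0.0
for rate, prob in zip(rates, posterior):
    cumulative += prob
    if lower is None and cumulative >= 0.025:
        lower = rate
    if upper is None and cumulative >= 0.975:
        upper = rate

print(f"most probable rate: {map_rate}")
print(f"95% credible interval: {lower} to {upper}")
```

The credible interval is the “shades of gray” part: instead of a single best-fitting value, the posterior reports a range of rates that remain plausible given the data.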
Directly calculating this posterior distribution is nearly impossible for all but the simplest statistical models, making this approach largely irrelevant for fitting the complex circuit models used in neuroscience. Statisticians have long worked to devise methods that can approximate the posterior distribution in situations where it can’t be defined analytically, but even these approximation methods can be too restricted for use on circuit models.
“These are very difficult problems,” says Luigi Acerbi, a postdoctoral researcher at the University of Geneva who has used statistical methods to fit cognitive models but was not involved in the current work. When it comes to making these methods work on real problems, Acerbi says, “people don’t realize how hard it is.”
Capturing a complex relationship
To overcome this difficulty, the two new efforts use an increasingly valuable technique in statistics and data science: deep neural networks, “which are incredibly good at extracting and finding patterns,” says Acerbi. They are therefore useful tools for capturing the complicated relationship between parameter values and the posterior probability.
Macke’s study, with first authors Pedro Gonçalves, Jan-Matthis Lueckmann and Michael Deistler, describes sequential neural posterior estimation, or SNPE. With this method, model builders start by inputting the dataset they want their model to replicate. Being able to fit parameters to a full dataset can be helpful when researchers are not sure which features of the data are most important to capture. Users can also input whatever information they have about the physiological properties in the form of a prior distribution over parameters. The model is then run many times using different sets of parameters sampled from this prior distribution, creating a database on which a deep neural network can be trained. This deep neural network learns to take in data as input and output a posterior distribution over parameters.
By finding an approximation of the posterior, the SNPE method can, once trained, provide a fuller picture of which sets of parameters can reproduce the data and which can’t. It both automates the process of parameter search that is normally done by hand and identifies all possible parameter regimes that replicate the data.
To check that their method was working as anticipated, the authors applied it to a problem where it was possible to directly calculate the posterior. Comparing their method with this ground truth, they found that SNPE is indeed capable of efficiently finding an accurate approximation of the posterior. Macke presented the research at the Neuromatch 2.0 virtual conference in May.
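The overall recipe (sample parameters from the prior, simulate, then train an estimator that maps data to a posterior) can be sketched on a toy problem where the exact posterior is known. This is not the authors’ SNPE implementation. It assumes a hypothetical one-parameter simulator, runs a single round rather than a sequence of rounds, and swaps the deep neural network for a straight-line Gaussian fit, which is enough here because the true posterior happens to be Gaussian.

```python
import random

random.seed(0)

NOISE_SD = 0.5  # hypothetical noise level of the toy simulator


def simulate(theta):
    """Toy simulator: the observation is the parameter plus Gaussian noise."""
    return theta + random.gauss(0.0, NOISE_SD)


# Step 1: sample parameters from the prior (standard normal) and simulate.
thetas = [random.gauss(0.0, 1.0) for _ in range(20000)]
xs = [simulate(theta) for theta in thetas]

# Step 2: fit q(theta | x) = Normal(slope * x + intercept, resid_var) to the
# simulated pairs. Least squares is the maximum-likelihood fit for this family.
n = len(xs)
mean_x = sum(xs) / n
mean_t = sum(thetas) / n
cov_xt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, thetas)) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n
slope = cov_xt / var_x
intercept = mean_t - slope * mean_x
resid_var = sum((t - (slope * x + intercept)) ** 2 for x, t in zip(xs, thetas)) / n


def estimated_posterior(x_obs):
    """Posterior mean and variance from the fitted estimator; no new simulations."""
    return slope * x_obs + intercept, resid_var


def exact_posterior(x_obs):
    """Analytic posterior for a standard normal prior and Gaussian noise."""
    precision = 1.0 + 1.0 / NOISE_SD ** 2
    return (x_obs / NOISE_SD ** 2) / precision, 1.0 / precision


x_observed = 1.2  # hypothetical observation
print("estimated (mean, variance):", estimated_posterior(x_observed))
print("exact     (mean, variance):", exact_posterior(x_observed))
```

Once fitted, the estimator returns a posterior for any new observation without further simulation, and comparing it with the analytic answer mirrors the kind of ground-truth check described above.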
Cunningham’s study, with first author Sean Bittner, introduced a complementary method: emergent property inference, or EPI. In EPI, a desired ‘emergent property’ of the model, such as a certain pattern of firing rates or a specific response to input, is defined mathematically. Aiming to capture emergent properties — rather than trying to replicate all the details of a full dataset — is common in computational modeling of neural circuits. This is because emergent properties may better capture the features of the data that are relevant for computation. In addition, because the same emergent property can be realized in many different ways in a full dataset, a focus on emergent properties lets the model builder ignore these presumably irrelevant differences. Furthermore, limitations in the way data are reported or how the model is constructed may make emergent properties the most reasonable option for model fitting.
As with SNPE, many simulations of the model are run using different parameters in order to identify those that replicate the emergent property and those that don’t. This information is used to train a deep neural network to produce a probability distribution over the parameters of the model. Without further assumptions put in, this distribution will identify all possible parameter regimes that can replicate the emergent property. Researchers can then see which sets of model parameters will create the behavior they wish to see in their model.
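The idea of sorting parameters by whether they produce an emergent property can be illustrated with a toy example. The sketch below is not EPI: in place of a trained deep network it uses plain rejection sampling, and in place of a circuit model it assumes two hypothetical coupled phase oscillators. The emergent property is taken to be phase locking, meaning the two oscillators settle into a fixed phase lag.

```python
import math
import random

random.seed(1)


def phase_locks(freq_gap, coupling, duration=100.0, dt=0.05, tol=1e-2):
    """Return True if two coupled phase oscillators settle to a fixed phase lag.

    The phase difference phi is assumed to obey
    d(phi)/dt = freq_gap - 2 * coupling * sin(phi).
    """
    phi = 0.0
    for _ in range(round(duration / dt)):
        phi += dt * (freq_gap - 2.0 * coupling * math.sin(phi))
    # If the phase difference has stopped drifting, the oscillators are locked.
    drift = freq_gap - 2.0 * coupling * math.sin(phi)
    return abs(drift) < tol


# Draw parameters from broad, hypothetical ranges and simulate each draw.
samples = []
for _ in range(400):
    freq_gap = random.uniform(-2.0, 2.0)
    coupling = random.uniform(0.1, 1.0)
    samples.append((freq_gap, coupling, phase_locks(freq_gap, coupling)))

# The accepted draws approximate the distribution of parameters that are
# consistent with the emergent property.
accepted = [(g, k) for g, k, locked in samples if locked]
rejected = [(g, k) for g, k, locked in samples if not locked]

print(f"{len(accepted)} of {len(samples)} parameter draws produce phase locking")
largest_ratio = max(abs(g) / k for g, k in accepted)
print(f"largest |freq_gap| / coupling among accepted draws: {largest_ratio:.2f}")
```

For this toy system the accepted draws can be checked against theory, since locking requires the frequency gap to be no larger than twice the coupling. Widening the gap or weakening the coupling past that boundary breaks the property, while changes that stay inside it keep the property intact.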
To demonstrate their methods’ abilities, both teams applied them to the classic STG circuit model. Cunningham and colleagues defined their desired emergent property for this model as ‘network syncing,’ a phenomenon in which the neurons in the network oscillate at the same defined frequency. By analyzing the probability distribution produced by EPI, they identified changes in parameters that would inhibit network syncing and those that would keep it intact.
Macke’s group used SNPE on their model of the ganglion and found that the same rhythmic pattern can be produced using very different sets of parameters — a result in line with the large parameter sweep performed by the Marder lab. What’s more, through their method they were able to find paths between these different parameter regimes that preserve the rhythmic activity throughout.
The results from both methods support the idea that circuits with very different properties can still produce the same behaviors. But, crucially, these methods can reach that conclusion with far fewer simulations than the standard trial-and-error approach.
Parameter fitting’s diverse potential
Both groups also showcased what their methods can do on a host of other problems.
EPI, for example, was applied to a model of primary visual cortex with multiple neuron types, a task-based model of superior colliculus, and a recurrent neural network model that performs a mathematical task. SNPE, for its part, was used to study models of ion channels and a Hodgkin-Huxley model of neural firing. Such an array of applications shows the diverse potential of these methods; the mathematical and biological details of the model are largely irrelevant to the parameter-fitting procedure. “They’re both very remarkable results,” says Acerbi. “Solving such a technical problem is a massive contribution.”
The ability of these methods to deal with high-dimensional parameter spaces is particularly notable. While previous methods worked well when fitting fewer than 10 parameters, the current approaches can scale to more than three times as many. Because the amount of time it takes to perform parameter sweeps through trial and error increases combinatorially with the number of parameters, efficient methods for exploring parameter space are particularly important in high dimensions.
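The arithmetic behind that growth is simple to sketch. Assuming, hypothetically, that a sweep tries 10 values for every parameter, the number of simulations needed to cover the full grid is 10 raised to the number of parameters.

```python
def grid_size(values_per_parameter, n_parameters):
    """Number of simulations needed to try every combination on a full grid."""
    return values_per_parameter ** n_parameters


# Hypothetical choice: 10 candidate values for each parameter.
for n_parameters in (5, 10, 30):
    print(f"{n_parameters} parameters: {grid_size(10, n_parameters):.0e} simulations")
```

Going from 10 to 30 parameters multiplies the size of the grid by a factor of 10 to the 20th power, which is why exhaustive sweeps become impractical long before models reach that size.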
Though the two methods address similar needs in computational neuroscience, each grew out of a separate line of work pursued in its own lab over several years. As a result, they differ under the hood in ways that may impact their performance on different models. For example, while SNPE can also be used to fit summary statistics rather than full datasets, EPI was specifically designed for fitting simple emergent properties. What these differences mean in practice is yet to be explored.
As with all new methods, it’s not yet clear how these new approaches for parameter exploration will hold up against the challenges circuit modelers can throw at them. “It’s hard to tell how they will fare in the wild,” says Acerbi. One important consideration is the amount of hand-tuning required to get them to work. Because these methods involve training deep neural networks, for example, users need to choose the hyperparameters associated with this learning. Another question is how robustly these methods will work under different modeling and data assumptions. Both will be important factors in how widely these methods are used.
Of course, these methods can only have an impact if people can readily use them. To help users get started, Macke and his team have released a Python toolbox to implement their SNPE method. EPI is available through the Cunningham lab’s GitHub page. In addition to sharing the code, Cunningham debuted a new platform for sharing and executing methods like EPI in an invited lecture at Cosyne in February. The platform, called Neuroscience Cloud Analysis as a Service or NeuroCAAS, aims to bring the infrastructure and efficiency of production-level data science to neuroscience.
At present, individual labs each develop their own code base, and the implementation of algorithms can vary across labs. This duplication of effort is unnecessary and can mean that the same algorithm produces different results in different labs. Furthermore, labs that are restricted by the computing power and expertise available to them can miss out on these algorithms entirely. NeuroCAAS aims to address these problems.
Data can be dragged and dropped into the online NeuroCAAS interface, where a selection of methods and algorithms will be run on it in the cloud. Providing compute resources in this way ensures that neuroscientists have equal access to cutting-edge algorithms. The interface also provides clear documentation of what analyses have been done and how, leaving a paper trail that can help with reproducibility.
In addition to EPI, NeuroCAAS offers analysis for processing calcium-imaging data and for tracking movement in behavioral videos. Claire Everett, a doctoral student at Columbia University in Andrés Bendesky’s lab, used the service to analyze videos of aggressive displays in Siamese fighting fish. According to Everett, NeuroCAAS allowed her to “forgo the tedious and extremely frustrating step of downloading and installing the tools needed for modern data analysis.” The efficiency of cloud computing also ensures a quick runtime. “I can analyze my videos at a fraction of the time compared to other services,” Everett says.
The platform is currently in closed alpha testing and is evolving to respond to its users’ needs. For example, Everett appreciates the work the team is putting into documenting the inner workings of the algorithms. “As an experimentalist, I have an instinct of wanting to see everything that is happening behind the curtain. I don’t wish to just give my data to someone and receive back the results.” If you would like to become a tester for NeuroCAAS or collaborate with the team, visit their website.