In late 2012, Eftychios Pnevmatikakis — then a postdoctoral researcher in statistical analysis of neural data at Columbia University — began working on a challenge he thought would be key to neuroscience research in the coming years: how to handle the flood of data emerging from ‘calcium imaging’ of animals’ brains.
The technology, which makes movies of animals’ brains by measuring light emitted from a protein as it binds to calcium ions in neurons, was starting to offer an exciting new window into living brains. The technique noninvasively captures the activity of large brain regions at single-neuron resolution and allows researchers to control exactly which neurons, and even which types of neurons, are being measured. In recent years, the use of calcium imaging has exploded, with neuroscientists making movies of insect, rodent and even primate brains.
Yet Pnevmatikakis was concerned that the technology might soon become a victim of its own success. Until a few years ago, calcium imaging movies were small enough that a laboratory assistant could manually label the neurons, frame by frame. Today, however, it is common to collect 100,000 frames per hour, each containing thousands of neurons.
Pnevmatikakis developed an algorithm that could automatically identify the neurons in certain datasets obtained from his colleagues at Columbia, and in 2014, he came to the Flatiron Institute (then called the Simons Center for Data Analysis, or SCDA) with a mandate to bridge the gap from his algorithm to a broadly useful software platform. “The mission statement of SCDA was to fill exactly this kind of gap, so I thought it would be a good match,” Pnevmatikakis says.
Pnevmatikakis soon brought on board Andrea Giovannucci, a neuroscientist who also has a Ph.D. in computer science. Giovannucci had been trying to interpret calcium imaging movies he’d made of mouse cerebellar granule cells, which are tiny and dense. “Recording them is like trying to distinguish specific voices in a full stadium,” Giovannucci says. With help from Pnevmatikakis, however, he was able to figure out how to adapt Pnevmatikakis’ algorithm to this setting, making it possible, he says, “to single out sentences from the heavy background noise and overlapping voices.”
The pair realized that to make a widely useful tool, they would have to look beyond Pnevmatikakis’ original tight focus on neuron detection. So they also worked out how to correct for animal motion during filming, and developed tools to benchmark the software’s performance.
The resulting software, now called CaImAn, has been freely available to the public since 2015 and has been widely adopted by neuroscientists who do calcium imaging. Meanwhile, Pnevmatikakis and Giovannucci are continuing to improve CaImAn. Most recently, in June 2017, they extended its functionality to identify neurons in real time, as data stream through the software frame by frame. The innovation means that researchers can run CaImAn on ordinary computers that don’t have enough storage for the entire dataset. “We like to say that we develop algorithms to do data analysis for the 99 percent,” Pnevmatikakis says.