Researchers create 'Wikipedia' for neurons
The decades' worth of data collected about the billions of neurons in the brain is astounding. To help scientists make sense of this "brain big data," researchers at Carnegie Mellon University have used data mining to create http://www.neuroelectro.org, a publicly available website that acts like Wikipedia, indexing physiological information about neurons.
The site will help to accelerate the advance of neuroscience research by providing a centralized resource for collecting and comparing data on neuronal function. A description of the data available and some of the analyses that can be performed using the site are published online by the Journal of Neurophysiology.
The neurons in the brain can be divided into approximately 300 different types based on their physical and functional properties. Researchers have been studying the function and properties of many different types of neurons for decades. The resulting data is scattered across tens of thousands of papers in the scientific literature. Researchers at Carnegie Mellon turned to data mining to collect and organize this data in a way that will make possible, for the first time, new methods of analysis.
"If we want to think about building a brain or re-engineering the brain, we need to know what parts we're working with," said Nathan Urban, interim provost and director of Carnegie Mellon's BrainHub neuroscience initiative. "We know a lot about neurons in some areas of the brain, but very little about neurons in others. To accelerate our understanding of neurons and their functions, we need to be able to easily determine whether what we already know about some neurons can be applied to others we know less about."
Shreejoy J. Tripathy, who worked in Urban's lab when he was a graduate student in the joint Carnegie Mellon/University of Pittsburgh Center for the Neural Basis of Cognition (CNBC) Program in Neural Computation, selected more than 10,000 published papers that contained physiological data describing how neurons responded to various inputs. He used text mining algorithms to "read" each of the papers. The text mining software found the portions of each paper that identified the type of neuron studied and then isolated the electrophysiological data related to the properties of that neuronal type. It also retrieved information about how each of the experiments in the literature was completed, and corrected the data to account for any differences that might be caused by the format of the experiment. Overall, Tripathy, who is now a postdoc at the University of British Columbia, was able to collect and standardize data for approximately 100 different types of neurons, which he published on the website neuroelectro.org.
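To give a rough sense of what this kind of extraction and standardization can look like, here is a minimal Python sketch. It is not the software Tripathy used. The synonym table, the unit table, the `extract` function and the sentence pattern are all assumptions made for illustration, and the sketch covers only unit conversion, not the correction for differences in experimental conditions described above.

```python
import re

# Hypothetical synonym table: maps phrases an author might use to one
# canonical property name. This is not the site's actual vocabulary.
PROPERTY_SYNONYMS = {
    "resting membrane potential": "resting_potential_mV",
    "resting potential": "resting_potential_mV",
    "input resistance": "input_resistance_MOhm",
}

# Assumed conversion factors into the canonical unit for each property.
UNIT_FACTORS = {
    "resting_potential_mV": {"mv": 1.0, "v": 1000.0},
    "input_resistance_MOhm": {"mohm": 1.0, "gohm": 1000.0},
}

# Longest phrases first so the most specific synonym is tried first.
_NAMES = sorted(PROPERTY_SYNONYMS, key=len, reverse=True)
VALUE_RE = re.compile(
    r"(?P<name>" + "|".join(re.escape(n) for n in _NAMES) + r")"
    r"\s+(?:was|of|=)\s+"
    r"(?P<value>[-+]?\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[A-Za-z]+)",
    re.IGNORECASE,
)


def extract(sentence):
    """Return (canonical_property, value_in_canonical_unit) pairs."""
    results = []
    for match in VALUE_RE.finditer(sentence):
        prop = PROPERTY_SYNONYMS[match.group("name").lower()]
        factor = UNIT_FACTORS[prop].get(match.group("unit").lower())
        if factor is None:
            # Unknown unit: skip rather than guess a conversion.
            continue
        results.append((prop, float(match.group("value")) * factor))
    return results


if __name__ == "__main__":
    text = ("The resting membrane potential was -65.2 mV and "
            "input resistance was 0.15 GOhm.")
    for prop, value in extract(text):
        print(prop, value)
```

A real pipeline would also have to identify which neuron type each value belongs to and handle tables, which is where most extraction errors are likely to arise.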
Since the data on the website was collected using text mining, the researchers realized that it was likely to contain errors related to extraction and standardization. Urban and his group validated much of the data, but they also created a mechanism that allows site users to flag data for further evaluation. Users also can contribute new data with minimal intervention from site administrators, similar to Wikipedia.
"It's a dynamic environment in which people can collect, refine and add data," said Urban, who is the Dr. Frederick A. Schwertz Distinguished Professor of Life Sciences and a member of the CNBC. "It will be a useful resource to people doing neuroscience research all over the world."
Ultimately, the website will help researchers find groups of neurons that share the same physiological properties, which could provide a better understanding of how a neuron functions. For example, if a researcher finds that a type of neuron in the brain's neocortex fires spontaneously, they can look up other neurons that fire spontaneously and access research papers that address this type of neuron. Using that information, they can quickly form hypotheses about whether or not the same mechanisms are at play in both the newly discovered and previously studied neurons.
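The lookup described above can be pictured as a simple filter over standardized records. The Python sketch below is illustrative only: the record layout, the field names, the `spontaneously_active` function and the 0.5 Hz threshold are assumptions, and the entries are placeholders rather than data from neuroelectro.org.

```python
# Hypothetical records; names and values are placeholders, not site data.
RECORDS = [
    {"neuron_type": "type_A", "region": "neocortex",
     "spontaneous_rate_hz": 4.0, "papers": ["paper_1"]},
    {"neuron_type": "type_B", "region": "midbrain",
     "spontaneous_rate_hz": 2.5, "papers": ["paper_2", "paper_3"]},
    {"neuron_type": "type_C", "region": "hippocampus",
     "spontaneous_rate_hz": 0.0, "papers": ["paper_4"]},
]


def spontaneously_active(records, min_rate_hz=0.5):
    """Map each neuron type firing at or above min_rate_hz to its papers."""
    return {
        record["neuron_type"]: record["papers"]
        for record in records
        if record["spontaneous_rate_hz"] >= min_rate_hz
    }


if __name__ == "__main__":
    for neuron_type, papers in spontaneously_active(RECORDS).items():
        print(neuron_type, papers)
```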
To demonstrate how neuroelectro.org could be used, the researchers compared the electrophysiological data from more than 30 neuron types that had been most heavily studied in the literature. These included pyramidal neurons in the hippocampus, which are responsible for memory, and dopamine neurons in the midbrain, thought to be responsible for reward-seeking behaviors and addiction, among others. The site was able to find many expected similarities between the different types of neurons, and some similarities that were a surprise to researchers. Those surprises represent promising areas for future research.
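One simple way to compare neuron types on several properties at once is to put each property on a common scale and then measure the distance between types. The article does not say which method the researchers used, so the Python sketch below is only one possible approach: it z-scores each property and uses Euclidean distance. The type names, the three properties and all values are placeholders.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

# Placeholder values for illustration only; not measurements from the site.
# Columns: resting potential (mV), input resistance (MOhm), spike width (ms).
PROFILES = {
    "type_A": [-70.0, 100.0, 1.0],
    "type_B": [-68.0, 110.0, 1.1],
    "type_C": [-55.0, 400.0, 0.4],
}


def zscore_columns(profiles):
    """Rescale each property to zero mean and unit variance across types."""
    names = list(profiles)
    columns = zip(*(profiles[name] for name in names))
    scaled_columns = []
    for column in columns:
        mu, sigma = mean(column), pstdev(column)
        scaled_columns.append(
            [(x - mu) / sigma if sigma else 0.0 for x in column]
        )
    return {
        name: [column[i] for column in scaled_columns]
        for i, name in enumerate(names)
    }


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_pair(profiles):
    """Return the two neuron types whose scaled profiles are most alike."""
    scaled = zscore_columns(profiles)
    return min(
        combinations(scaled, 2),
        key=lambda pair: distance(scaled[pair[0]], scaled[pair[1]]),
    )


if __name__ == "__main__":
    print(closest_pair(PROFILES))
```

Scaling matters here because the properties are in different units; without it, input resistance would dominate the distance simply because its numbers are larger.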
In ongoing work, the Carnegie Mellon researchers are comparing the data on neuroelectro.org with other kinds of data, including data on neurons' patterns of gene expression. For example, Urban's group is using another publicly available resource, the Allen Brain Atlas, to find whether groups of neurons with similar electrical function have similar gene expression.
"It would take a lot of time, effort and money to determine both the physiological properties of a neuron and its gene expression," Urban said. "Our website will help guide this research, making it much more efficient."