Global brain initiatives generate tsunami of neuroscience data
Three years ago, the White House launched the Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative to accelerate the development and application of novel technologies that will give us a better understanding of how brains work.
Since then, dozens of technology firms, academic institutions, scientists and others have been developing new tools to give researchers unprecedented opportunities to explore how the brain processes, utilizes, stores and retrieves information. But without a coherent strategy to analyze, manage and understand the data generated by these new technologies, advancements in the field will be limited.
This is precisely why Lawrence Berkeley National Laboratory (Berkeley Lab) Computational Neuroscientist Kristofer Bouchard assembled an international team of interdisciplinary researchers—including mathematicians, computer scientists, physicists and experimental and computational neuroscientists—to develop a plan for managing, analyzing and sharing neuroscience data. Their recommendations were published in a recent issue of Neuron.
"The U.S. BRAIN Initiative is just one of many national and private neuroscience initiatives globally that are working toward accelerating our understanding of brains," says Bouchard. "Many of these efforts have given a lot of attention to the technological challenges of measuring and manipulating neural activity, while significantly less attention has been paid to the computing challenges associated with the vast amounts of data that these technologies are generating."
To maximize the return on investments in global neuroscience initiatives, Bouchard and his colleagues argue that the international neuroscience community should have an integrated strategy for data management and analysis. This coordination would make workflows easier to reproduce, which in turn would allow researchers to build on each other's work.
As a first step, the authors recommend that researchers from all facets of neuroscience agree on standard descriptions and file formats for products derived from data analysis and simulations. After that, the researchers should work with computer scientists to develop hardware and software ecosystems for archiving and sharing data.
The authors suggest an ecosystem similar to the one used by the physics community to share data collected by experiments like the Large Hadron Collider (LHC). In this model, each research group keeps its own local repository of the physiological or simulation data it has collected or generated. Eventually, though, all of this information should also be included in "meta-repositories" that are accessible to the greater neuroscience community. Files in the meta-repositories should be in a common format, and the repositories would ideally be hosted by an open-science supercomputing facility like the Department of Energy's (DOE's) National Energy Research Scientific Computing Center (NERSC), located at Berkeley Lab.
Because novel technologies are producing unprecedented amounts of data, Bouchard and his colleagues also propose that neuroscientists collaborate with mathematicians to develop new approaches for data analysis and modify existing analysis tools to run on supercomputers. To maximize these collaborations, the analysis tools should be open-source and should integrate with brain-scale simulations, they emphasize.
"These are the early days for neuroscience and big data, but we can see the challenges coming. This is not the first research community to face big data challenges; climate and high energy physics have been there and overcome many of the same issues," says Prabhat, who leads NERSC's Data & Analytics Services Group.
Berkeley Lab is well positioned to help neuroscientists address these challenges because of its long tradition of interdisciplinary science, Prabhat adds. DOE facilities like NERSC and the Energy Sciences Network (ESnet) have worked closely with Lab computer scientists to help a range of science communities—from astronomy to battery research—collaborate and manage and archive their data. Berkeley Lab mathematicians have also helped researchers in various scientific disciplines develop new tools and methods for data analysis on supercomputers.
"Harnessing the power of HPC resources will require neuroscientists to work closely with computer scientists and will take time, so we recommend rapid and sustained investment in this endeavor now," says Bouchard. "The insights generated from this effort will have high-payoff outcomes. They will support neuroscience efforts to reveal both the universal design features of a species' brain and help us understand what makes each individual unique."