Compiling big data in a human-centric way

May 11, 2017
A depiction of the double helical structure of DNA. Its four coding units (A, T, C, G) are color-coded in pink, orange, purple and yellow. Credit: NHGRI

When a group of researchers in the Undiagnosed Diseases Network at Baylor College of Medicine realized they were spending days combing through databases for information on gene variants, they decided to do something about it. By creating MARRVEL (Model organism Aggregated Resources for Rare Variant ExpLoration), they can now help not only their own lab but also researchers everywhere search multiple databases at once, in a matter of minutes.

This collaborative effort among Baylor, the Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital and Harvard Medical School is described in the latest online edition of the American Journal of Human Genetics.

Big data search engine

"One big problem we have is that tens of thousands of variants and phenotypes are spread throughout a number of databases, each one with their own organization and nomenclature that aren't easily accessible," said Julia Wang, an M.D./Ph.D. candidate in the Medical Scientist Training Program at Baylor and a McNair Student Scholar in the Bellen lab, as well as first author on the publication. "MARRVEL is a way to assess the large volume of data, providing a concise summary of the most relevant information in a rapid user-friendly format."

MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER, all separate databases to which researchers across the globe have contributed, sharing tens of thousands of human genome variants and phenotypes. Because there is no set standard for recording this type of information, each database takes a different approach, and searching each one can yield results organized in different ways. Similarly, decades of research in various model organisms, from mouse to yeast, are stored in their own individual databases with different sets of standards.

Dr. Zhandong Liu, assistant professor in pediatrics - neurology at Baylor, a member of the Jan and Dan Duncan Neurological Research Institute at Texas Children's and co-corresponding author on the publication, explains that MARRVEL acts much like an internet search engine.

"This program helps to collate the information in a common language, drawing parallels and putting it together on one single page. Our program curates model organism specific databases to concurrently display a concise summary of the data," Liu said.

Supporting researchers

A user can first search for a gene or variant, Wang explains. Results may include what is known about the gene overall, whether it is associated with a disease, whether a variant is common in the general population and how the gene is affected by certain mutations.
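The kind of collation described here, in which records from differently organized sources are mapped to a common vocabulary and gathered under one gene, can be illustrated with a minimal Python sketch. It is not MARRVEL's actual implementation: the source names, field names, gene symbol and values below are all invented placeholders and do not reflect the real schemas or contents of OMIM, ExAC, ClinVar or any other database.

```python
from collections import defaultdict

# Hypothetical records: field names and values are invented placeholders,
# not the real schemas or contents of any database named in the article.
SOURCE_A = [{"gene_symbol": "GENE1", "disease": "example syndrome"}]
SOURCE_B = [{"Gene": "gene1", "allele_freq": 0.0002}]
SOURCE_C = [{"symbol": "GENE1 ", "significance": "uncertain"}]

# Assumed per-source mapping from native field names to one shared vocabulary.
FIELD_MAPS = {
    "source_a": {"gene_symbol": "gene", "disease": "disease"},
    "source_b": {"Gene": "gene", "allele_freq": "population_frequency"},
    "source_c": {"symbol": "gene", "significance": "clinical_significance"},
}


def normalize(record, field_map):
    """Rename a record's fields to the shared vocabulary and tidy the gene symbol."""
    out = {field_map[key]: value for key, value in record.items() if key in field_map}
    out["gene"] = out["gene"].strip().upper()
    return out


def collate(sources):
    """Group normalized records from every source under a single gene key."""
    summary = defaultdict(dict)
    for name, records in sources.items():
        for record in records:
            row = normalize(record, FIELD_MAPS[name])
            gene = row.pop("gene")
            summary[gene][name] = row
    return dict(summary)


summary = collate(
    {"source_a": SOURCE_A, "source_b": SOURCE_B, "source_c": SOURCE_C}
)
print(summary["GENE1"])
```

In this toy version, three records that spell the same gene three different ways end up on one "page" for that gene, with each source's contribution kept separate so its origin stays visible.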

"MARRVEL helps to facilitate analysis of human genes and variants by cross-disciplinary integration of 18 million records so we can speed up the discovery process through computation," Liu said. "All this information is basically inaccessible unless researchers can access it efficiently and apply it to their own work to find causes, treatments and hopefully identify new diseases."

Collaboration

This project started as a necessity for the Model Organism Screening Center for the Undiagnosed Diseases Network at Baylor, but as it grew, the group began reaching out to researchers in different disciplines for feedback on how MARRVEL might benefit them.

"This program is just the start. I think our tool is going to be a model for us to help clinicians and basic scientists more efficiently use the information already publicly available," Wang said. "It will help us understand and process all of the different mutations that researchers are discovering."

"The most exciting part is how this project is bringing so many different researchers together," Liu said. "We are working with labs we might not have normally collaborated with, trying to put together a puzzle of all this data."

Both Wang and Liu are thankful for the contributions of the genetics communities, which allowed them access to the databases as they developed MARRVEL.

Others who contributed to the findings include Drs. Rami Al-Ouran, Seon-Young Kim, Ying-Wooi Wan, Michael Wangler, Shinya Yamamoto, Hsiao-Tuan Chao, and Hugo Bellen (Howard Hughes Medical Institute at Baylor), all with Baylor College of Medicine; and Yanhui Hu, Aram Comjean, Stephanie E. Mohr, and Norbert Perrimon (Howard Hughes Medical Institute at Harvard Medical School), all with Harvard Medical School.

More information: Julia Wang et al. MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome, The American Journal of Human Genetics (2017). DOI: 10.1016/j.ajhg.2017.04.010
