Baylor, DNAnexus, Amazon Web Services collaboration enables largest-ever cloud-based analysis of genomic data

October 25, 2013

By taking part in the largest cloud-based analysis of genome sequence data to date, researchers from the Baylor College of Medicine Human Genome Sequencing Center are helping to usher genomic scientists and clinicians around the world into a new era of high-level data analysis. (A "cloud" is a virtual network of remote internet servers used to store, manage and process information.)

"The mission of the Baylor Human Genome Sequencing Center is to drive genomics and genomic analysis to be at the leading edge of everything in the field," said Dr. Jeffrey Reid, assistant professor in the Human Genome Sequencing Center at BCM, who led the BCM portion of the project. "In terms of analysis, the future of genomic research and genomic medicine is in the cloud. We are very much going towards more computing and not less."

Together with the Platform-as-a-Service company DNAnexus and Amazon Web Services, the largest provider of cloud computing services, BCM sequenced the DNA of more than 14,000 individuals (3,751 whole genomes and 10,771 whole exomes) using next generation sequencing. (An exome contains all the genes in a genome and is the part of the genome that provides the blueprints for proteins.) The individuals whose genetic material was sequenced are part of the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium, or CHARGE, a project aimed at advancing understanding of the genetic contributions to heart disease and aging.

Reid gave a presentation on the project today (Oct. 25) at the American Society of Human Genetics annual meeting in Boston.

The BCM Human Genome Sequencing Center-developed Mercury pipeline, a semi-automated and modular set of tools for the analysis of next generation sequencing data in both research and clinical contexts, was an integral part of the project. The pipeline identifies mutations from genomic data, setting the stage for determining the significance of these mutations as a cause of serious disease.

Led by Dr. Eric Boerwinkle, professor and director of the Human Genetics Center at The University of Texas Health Science Center at Houston and associate director of the Human Genome Sequencing Center at BCM, the CHARGE project involves more than 300 researchers across five institutions around the world. The cloud-based analysis gives this large group access to an expansive body of data on a server that is HIPAA certified, so that patient privacy is not compromised.

"The collaboration between the CHARGE consortium and the Human Genome Sequencing Center is leading to discovery of those genes contributing to risk of the most important diseases plaguing the U.S. population across all age groups," said Boerwinkle. "Ultimately, these discoveries forge a path toward novel therapeutics and diagnostics. The use of cloud computing and collaboration with DNAnexus is allowing us to achieve our goals faster and in a more cost-effective manner." (Boerwinkle will give an updated presentation November 15 at the Cold Spring Harbor Laboratory's Personal Genomes & Pharmacogenomics Meeting.)

"Having access to this much data was unique," said Reid. "Many institutions do not have the local compute resources and infrastructure to support large scale analysis projects like this one, so we were lucky to come together with DNAnexus and Amazon Web Services to make this project possible."

The analysis ran over a four-week period and required approximately 2.4 million core-hours of computational time, generating 440 TB (terabytes) of results and using nearly a petabyte of storage.

By comparison, the 1000 Genomes Project sequenced 2,535 exomes and required 25 TB of data.
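
To put the quoted figures in perspective, the short sketch below works out two rough averages: how many cores would have to run around the clock to consume 2.4 million core-hours in four weeks, and how much result data that implies per sequenced individual. It rests on assumptions that are not in the article: that the "four-week period" means 28 full days, that the load was spread evenly, and that 1 TB equals 1,000 GB. The per-sample figure is especially crude, because whole genomes produce far more data than exomes.

```python
# Back-of-envelope arithmetic on the figures quoted in the article.
# Assumptions (not from the article): four weeks = 28 full days,
# evenly spread load, and 1 TB = 1,000 GB.

CORE_HOURS = 2_400_000          # "approximately 2.4 million core-hours"
WALL_CLOCK_HOURS = 4 * 7 * 24   # four weeks expressed in hours
RESULTS_TB = 440                # "440 TB of results"
GENOMES = 3_751                 # whole genomes
EXOMES = 10_771                 # whole exomes


def average_concurrent_cores(core_hours, wall_hours):
    """Average number of cores busy at once over the whole run."""
    return core_hours / wall_hours


def results_gb_per_sample(results_tb, samples):
    """Average result size per sample, in gigabytes."""
    return results_tb * 1000 / samples


samples = GENOMES + EXOMES
avg_cores = average_concurrent_cores(CORE_HOURS, WALL_CLOCK_HOURS)
gb_per_sample = results_gb_per_sample(RESULTS_TB, samples)

print(f"samples analyzed:         {samples:,}")
print(f"average concurrent cores: {avg_cores:,.0f}")
print(f"results per sample (GB):  {gb_per_sample:.1f}")
```

Under those assumptions the run averages out to several thousand cores working continuously, which is the scale Reid refers to when he says many institutions lack the local compute resources for a project like this.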

"It is very important for us to create a centralized space where researchers from all over the world can come and collaborate with the data," said Reid. "This project creates expansive access to this data over a protected network that will advance research."
