Genomic Data Commons at University of Chicago launches new era of cancer data sharing

June 6, 2016, University of Chicago Medical Center

The Genomic Data Commons (GDC), a next-generation platform that enables unprecedented data access, analysis and sharing for cancer research, publicly launched at the University of Chicago on June 6, opening the door to discoveries for this complex set of diseases.

The GDC went live with approximately 4.1 petabytes of data from National Cancer Institute-supported research programs, including some of the largest and most comprehensive cancer genomics datasets in the world—such as The Cancer Genome Atlas and Therapeutically Applicable Research to Generate Effective Treatments—and more than 14,000 anonymized patient cases. One petabyte equals 1 million gigabytes.

Vice President Joe Biden toured the GDC operations center at the University of Chicago in advance of his appearance to announce the project at the annual meeting of the American Society of Clinical Oncology on June 6.

The Data Commons centralizes, standardizes and harmonizes genomic and clinical data on a unified and interoperable platform. Cancer researchers can access these data for analyses and submit their own datasets to share with the research community. By making high-quality data broadly accessible, the GDC provides much-needed tools to accelerate studies of the biological mechanisms of cancer and the development of personalized treatments for individual patients.

UChicago developed and operates the Data Commons with NCI funding under a subcontract from Leidos Biomedical Research at the Frederick National Laboratory for Cancer Research, in collaboration with the Ontario Institute for Cancer Research.

Development of the GDC began in 2014 at UChicago's Center for Data Intensive Science (CDIS). Over the past two years, the team has created an innovative suite of tools, software and infrastructure—based on CDIS open-source projects such as the Bionimbus Protected Data Cloud—to curate the massive amounts of data held by the GDC.

"Today, making discoveries from cancer data is challenging because diverse research groups analyze different cancer datasets using various methods that are not easily comparable," said GDC principal investigator Robert Grossman, professor of medicine and director of CDIS at UChicago. "The Genomic Data Commons brings together genomic datasets and analyzes the data using a common set of methods so that researchers may more easily make discoveries, and, in this sense, democratizes the analysis of large cancer genomic datasets."

"Big data" is recognized as essential to efforts in understanding and treating cancer. Cancer is as complex as it is devastating. It involves a host of genetic, lifestyle and environmental factors, and is now known to comprise hundreds of diseases, each with unique features, driving forces and vulnerabilities to treatments. Large sample sizes are required to provide the statistical power to understand which combinations of drugs are effective against which combinations of mutations that drive cancer.

Breaking Barriers

While enormous amounts of genomic and clinical data have been gathered by NCI-funded research, several barriers have prevented researchers from making full use of them. Genomic data from different projects, clinical trials and cancer types are siloed in different locations with local management systems, making data sharing difficult. These large datasets can take months to download, and not all researchers have access to the sophisticated tools needed to study them. In addition, disparate collection and analysis approaches by separate research groups inhibit collaborative work.
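A rough back-of-the-envelope calculation shows why downloads on this scale are measured in months. The Python sketch below estimates how long it would take to move the 4.1 petabytes cited above over a single network link. The link speeds, the use of decimal units, and the assumption of a fully sustained transfer are illustrative choices, not figures from the GDC or NCI, and the function name `transfer_days` is hypothetical.

```python
# Back-of-the-envelope transfer-time estimate.
# Assumptions (not from the article): decimal units (1 PB = 10**15 bytes)
# and a link that sustains its full rated speed for the entire transfer.
BYTES_PER_PETABYTE = 10**15
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 86_400


def transfer_days(petabytes: float, gigabits_per_second: float) -> float:
    """Return the days needed to move `petabytes` at a sustained link rate."""
    total_bits = petabytes * BYTES_PER_PETABYTE * BITS_PER_BYTE
    seconds = total_bits / (gigabits_per_second * 10**9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # 4.1 PB is the launch figure from the article; the rates are hypothetical.
    for rate in (1, 10):
        print(f"{rate} Gbit/s: {transfer_days(4.1, rate):.0f} days")
```

Under these assumptions, the full collection would take more than a year to move at 1 gigabit per second and over a month at 10 gigabits per second. Real transfers rarely sustain a link's rated speed, so actual times would likely be longer.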

The GDC breaks down these barriers by bringing cancer genomics datasets and associated clinical data into one location that any researcher may access. It harmonizes the data with a common set of analytic pipelines to make it easier to study the information, which in the past has typically been available as separate datasets analyzed with separate pipelines. By making these data available using modern computing and network technology, the GDC makes it possible for any researcher to ask new and fundamental questions about cancer.

Built and managed by Grossman's team at the University of Chicago, the GDC will:

  • Serve as a central unified repository for cancer genomic data and associated clinical data.
  • Clean, standardize and harmonize data, as well as provide quality control, so that analyses can be conducted using common algorithms and pipelines.
  • Support basic research and clinical trials by making data easily accessible, findable, interoperable and reusable.
  • Provide powerful data transfer, search, Application Programming Interface (API) and analysis tools to researchers at no cost.

A Foundation for the Future

As the first step in a next-generation knowledge system for cancer, the GDC enables and accelerates efforts to identify both high- and low-frequency cancer driver mutations, assists in revealing the genetic determinants of response to therapy, and informs the composition of clinical trial cohorts.

The GDC will help bridge silos by providing researchers with access to high-quality data, the tools needed to share and study them, and support to submit their own data. It will house data from a new era of programs that will sequence the DNA of patients enrolled in NCI clinical trials. These datasets will lead to a much deeper understanding of which therapies are most effective for different cancers. The GDC will support clinical trials that focus on single patients, known as "n of 1" trials, and will become an important component in how precision medicine is used to treat individual patients.

The GDC also creates a foundation for future cloud-based technologies that could allow researchers to analyze large-scale datasets and perform experiments remotely, such as through the NCI's Cancer Cloud Pilots Program. In addition, the open-source software being developed by CDIS has the potential to become a model for data-intensive research efforts for other diseases, such as Alzheimer's and diabetes, which would greatly benefit from similar large-scale, data-driven approaches to develop cures.

"We are at a crossroads today in whether we will have the critical mass of cancer-related data needed to power new discoveries and improve care," Grossman said. "Over time, I expect the GDC will play a more and more important role in providing the data required at the scale required so that precision medicine fulfills its promise."
