Researchers bring order to big data of human biology

April 27, 2015
The functional genetic network shown is just one of the 144 such networks identified for a diverse set of human tissues and cell types. Credit: (c) Simons Center for Data Analysis

A multi-year study led by researchers from the Simons Center for Data Analysis (SCDA) and major universities and medical schools has broken substantial new ground, establishing how genes work together within 144 different human tissues and cell types in carrying out those tissues' functions.

The paper, to be published online by Nature Genetics on April 27, also demonstrates how computer science and statistical methods may combine to aggregate and analyze very large—and stunningly diverse—genomic 'big-data' collections.

Led by Olga Troyanskaya, deputy director for genomics at SCDA, the team collected and integrated data from about 38,000 genome-wide experiments (from an estimated 14,000 publications). These datasets necessarily contain not only information about cells' RNA/protein functions, but also information from individuals diagnosed with a variety of illnesses.

Using integrative computational analysis, the researchers first isolated the functional genetic interconnections contained in these rich datasets for various tissue types. Then, combining that tissue-specific functional signal with the relevant disease's DNA-based genome-wide association studies (GWAS), the researchers were able to identify statistical associations between genes and diseases that would otherwise be undetectable.

The resulting technique, which they called a 'network-guided association study,' or NetWAS, thus integrates quantitative genetics with functional genomics to increase the power of GWAS and identify genes underlying complex human diseases. And because the technique is completely data-driven, NetWAS avoids bias toward better-studied genes and pathways, permitting discovery of novel associations.
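
To make the idea of network-guided reprioritization concrete, here is a minimal Python sketch. It is not the NetWAS algorithm from the paper, whose details this article does not give; it is a simple guilt-by-association stand-in that scores each gene by how strongly it is connected, within one tissue network, to genes with nominally significant GWAS p-values. The gene names, edge weights, p-values and the 0.01 threshold are all hypothetical.

```python
from collections import defaultdict


def network_guided_scores(edges, gwas_p, threshold=0.01):
    """Score genes by their weighted connectivity to GWAS 'seed' genes.

    edges:     iterable of (gene_a, gene_b, weight) for one undirected tissue network
    gwas_p:    dict mapping gene -> GWAS p-value
    threshold: p-value cutoff for seed genes (an illustrative assumption)
    """
    seeds = {gene for gene, p in gwas_p.items() if p < threshold}
    to_seeds = defaultdict(float)  # edge weight linking a gene to seed genes
    total = defaultdict(float)     # total edge weight of a gene
    for a, b, w in edges:
        # Treat each edge as undirected by visiting it in both directions.
        for src, dst in ((a, b), (b, a)):
            total[src] += w
            if dst in seeds:
                to_seeds[src] += w
    # Fraction of each gene's connectivity that points at seed genes.
    return {gene: to_seeds[gene] / total[gene] for gene in total if total[gene] > 0}


# Toy data: GENE_C has an unremarkable p-value but sits next to two seed genes.
edges = [
    ("GENE_A", "GENE_B", 0.9),
    ("GENE_A", "GENE_C", 0.8),
    ("GENE_B", "GENE_C", 0.7),
    ("GENE_C", "GENE_D", 0.1),
    ("GENE_D", "GENE_E", 0.6),
]
gwas_p = {"GENE_A": 0.001, "GENE_B": 0.004, "GENE_C": 0.20,
          "GENE_D": 0.03, "GENE_E": 0.50}

scores = network_guided_scores(edges, gwas_p)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network GENE_C ranks first even though its own p-value (0.20) would never pass a significance cutoff, because nearly all of its connectivity points at genes that do. That is the kind of association a network can surface and a GWAS alone would miss.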

SCDA director Leslie Greengard says, "Olga and her collaborators have demonstrated that extraordinary results can be achieved by merging deep biological insight with state-of-the-art computational methods, and applying them to large-scale, noisy and heterogeneous datasets."

The result of their efforts was 144 functional gene interaction networks for organs as diverse as the kidney, the liver and the whole brain. The paper goes on to describe functional gene disruptions for diseases such as hypertension, diabetes and obesity.

Importantly, while such functional gene interaction networks had already been established in animal models, this feat had not yet been accomplished in human tissue, and could not have been accomplished without 'big data.' Many human cell types important to disease cannot be studied by traditional direct experimentation, so the ability to instead work with these rich datasets was a critical workaround.

"A key challenge in human biology is that genetic circuits in human tissues and cell types are very difficult to study experimentally," says Troyanskaya, who is also a professor in the computer science department and the Lewis-Sigler Institute for Integrative Genomics at Princeton University. "For example, the podocyte cells in the kidneys that perform the kidney's filtering function cannot be isolated for study in the lab, nor can the function of genes be identified by genome-scale experiments. Yet we need to understand how proteins interact in these cells if we want to understand and treat chronic kidney disease. Our approach mined these big data collections to build a map of how genetic circuits function in the podocyte cells, and in many other disease-relevant tissues and cell types."

These findings have important implications not only for our understanding of normal gene function but also for drug use and development: causal or target genes may be better identified for treatment, and previously unexpected drug interactions and disruptions may be anticipated. "Biomedical researchers can use these networks and the pathways that they uncover to understand drug action and side effects in the context of specific disease-relevant tissues, and to repurpose drugs," Troyanskaya says. "These networks can also be useful for understanding how various therapies work and to help with developing new therapies."

The researchers have also created an online resource so that other scientists may use NetWAS and access the tissue-specific networks. The team created an interactive server, the Genome-scale Integrated Analysis of Networks in Tissues, or GIANT. GIANT allows users to explore the networks, compare how genetic circuits vary across tissues, and analyze data from genetic studies to find genes that cause disease.
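
As a rough picture of what comparing a gene pair across tissues involves, the following Python sketch looks up one functional edge in several tissue networks and orders the tissues by edge strength. It does not use GIANT's actual interface or data format, which this article does not describe; the dictionary layout, gene names and weights are hypothetical.

```python
def compare_across_tissues(networks, gene_a, gene_b):
    """Return (tissue, weight) pairs for one gene pair, strongest tissue first.

    networks: dict mapping tissue name -> {frozenset({gene, gene}): weight}
              (a hypothetical in-memory layout, not GIANT's format)
    """
    pair = frozenset((gene_a, gene_b))  # unordered, so (A, B) equals (B, A)
    weights = {tissue: net.get(pair, 0.0) for tissue, net in networks.items()}
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Toy networks: the same gene pair is tightly linked in kidney, weakly in liver,
# and has no recorded relationship in brain.
networks = {
    "kidney": {frozenset(("GENE_A", "GENE_B")): 0.85},
    "liver": {frozenset(("GENE_A", "GENE_B")): 0.30},
    "brain": {},
}

result = compare_across_tissues(networks, "GENE_B", "GENE_A")
for tissue, weight in result:
    print(tissue, weight)
```

Storing each edge under a `frozenset` key means the lookup does not depend on the order in which the two genes are given.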

Aaron K. Wong, a data scientist at SCDA and formerly a graduate student in the computer science department at Princeton, led the way in creating GIANT. "Our goal was to develop a resource that was accessible to biomedical researchers," he says. "For example, with GIANT, researchers studying Parkinson's disease can search the substantia nigra network, which represents the brain region affected by Parkinson's, to identify new genes and pathways involved in the disease." Wong is one of three co-first authors of the paper.

The paper's other two co-first authors are Arjun Krishnan, a postdoctoral fellow at the Lewis-Sigler Institute; and Casey S. Greene, assistant professor of genetics at Dartmouth College, who was a postdoctoral fellow with the Troyanskaya group from 2009 to 2012. Other key collaborators on this study were Emanuela Ricciotti, Garret A. FitzGerald and Tilo Grosser of the pharmacology department and the Institute for Translational Medicine and Therapeutics at the Perelman School of Medicine, University of Pennsylvania; Daniel I. Chasman of Brigham and Women's Hospital and Harvard Medical School in Boston; and Kara Dolinski at the Lewis-Sigler Institute at Princeton University.

"This is an exciting time in biomedical research, and I believe we are still at the early stages of developing new ways to think about biological networks and their control," Greengard says.

More information: Understanding multicellular function and disease with human tissue-specific networks, Nature Genetics, http://dx.doi.org/10.1038/ng.3259

3 comments

PeterKinnon
not rated yet Apr 27, 2015
This is a valuable work in its own right and has great potential in the analysis of cell mechanisms and interactions.

Even more importantly, it underlines the reality that biological systems are, at all levels, network functions, and that the cell, not the genome, is the fundamental unit of inheritance.

However, the first sentence of the article is misleading in averring that "genes work together within 144 different human tissues and cell types in carrying out those tissues' functions."

Genes do not "work together" in doing anything. Genes are simply protein recipes and entirely passive.

They are used by the innumerable active components of cell machinery to manufacture specific structures and additional machinery in response to signals received from their environment.

This interpretation in terms of networks is expanded upon in my latest book, "The Intricacy Generator: Pushing Chemistry and Geometry Uphill," available as a 336-page illustrated paperback from Amazon.
JVK
1 / 5 (2) Apr 27, 2015
Nutrient-dependent RNA-directed DNA methylation and RNA-mediated amino acid substitutions link the conserved molecular mechanisms of biophysically constrained cell type differentiation in all genera.

See my invited review of nutritional epigenetics "Nutrient-dependent pheromone-controlled ecological adaptations: from atoms to ecosystems" http://figshare.c...s/994281

See also the examples from model organisms that link a single amino acid substitution to differences in the morphological and behavioral phenotypes of a modern human population from what is known about cell type differentiation during the life history transitions of the honeybee model organism and epigenetically effected metabolic and genetic networks.

Nutrient-dependent/pheromone-controlled adaptive evolution: a model http://www.ncbi.n...3960065/
JVK
1 / 5 (2) Apr 27, 2015
See also: "Oppositional COMT Val158Met effects on resting state functional connectivity in adolescents and adults" http://dx.doi.org...4-0895-5

The Val158Met amino acid substitution can be placed into the context of everything currently known about links from nutritional epigenetics to pharmacogenomics and "Precision Medicine."

Clinically Actionable Genotypes Among 10,000 Patients With Preemptive Pharmacogenomic Testing http://www.medsca...24253661

But first, evolutionary theorists must learn the difference between mutations, which perturb protein folding, and amino acid substitutions that stabilize the organized genomes of species from microbes to man via their fixation in the context of nutrient-dependent pheromone-controlled reproduction.
