Supercomputer helps researchers interpret genomes

July 2, 2014 by Aaron Dubrow
Picture DNA on Facebook. The image above is a map of links between the genes of the mustard plant Arabidopsis thaliana. Genes involved in the same biological process are connected by lines: red for more certain links, blue for less certain links. "It's not unlike a social network," says biologist Seung Yon Rhee. Credit: Insuk Lee, Michael Ahn, Edward Marcotte, Seung Yon Rhee, Carnegie Institution for Science

Tandem protein mass spectrometry is one of the most widely used methods in proteomics, the large-scale study of proteins, particularly their structures and functions.

Researchers in the Marcotte group at the University of Texas at Austin are using the Stampede supercomputer to develop and test computer algorithms that let them more accurately and efficiently interpret proteomics mass spectrometry data.

The researchers are midway through a project that analyzes the largest animal proteomics dataset ever collected (data equivalent to roughly half of all currently existing shotgun data in the ). These samples span protein extracts from a wide variety of tissues and cell types sampled across the animal tree of life.

The analyses consume considerable computing cycles and require the use of Stampede's large memory nodes, but they allow the group to reconstruct the 'wiring diagrams' of cells by learning how all of the proteins encoded by a genome are associated into functional pathways, systems, and networks. Such models let scientists better define the functions of genes, and link genes to traits and diseases.

"Researchers would usually analyze these sorts of datasets one at a time," Edward Marcotte said. "TACC let us scale this to thousands."
