New epidemiology model combines multiple genomic data

April 8, 2014, Brown University
A new statistical model brings critical elements -- single-nucleotide differences in DNA and data on gene expression and methylation -- into the study of associations between genetics and disease. Credit: Brown University

The difference between merely throwing around buzzwords like "personalized medicine" and "big data" and delivering on their medical promise is in the details of developing methods for analyzing and interpreting genomic data. In a pair of new papers, Brown University epidemiologist Yen-Tsung Huang and colleagues show how integrating different kinds of genomic data could improve studies of the association between genes and disease.

Huang integrates three kinds of data: single-nucleotide differences in DNA, called SNPs; gene expression, which is how the body puts genes into action; and methylation, a chemical alteration related to expression. All are potentially relevant to whether a person gets sick, but most analyses that connect genomics to disease account for only one. In papers now online in the journals Biostatistics and Annals of Applied Statistics, Huang describes the results of testing the model in analyses of asthma and brain cancer data.

"Our integrated approach outperforms single-platform approaches," Huang said. "Applied to real data sets, it works."

Improved performance

The statistical model Huang developed with Tyler VanderWeele and Xihong Lin of Harvard, co-authors on the Annals paper, isn't purely statistical. Its structure and assumptions are informed by the underlying biology. SNPs can be directly associated with disease, or that association can be mediated by whether genes, including the ones in which the SNPs reside, are expressed in healthy or sick patients.
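
The structure described above (a SNP affecting disease directly and also through expression) can be illustrated with a minimal sketch. This is not the model from the papers: it is a textbook product-of-coefficients mediation analysis on simulated data, and it assumes a continuous outcome rather than a disease status so that ordinary least squares applies. The function names, effect sizes and sample size are all hypothetical, chosen only for illustration.

```python
import numpy as np

def ols(predictors, response):
    """Least-squares coefficients for response ~ predictors, intercept first."""
    design = np.column_stack([np.ones(len(response)), predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

def mediation_effects(snp, expression, outcome):
    """Split the SNP-outcome association into direct and mediated parts."""
    a = ols(snp, expression)[1]  # SNP -> expression
    joint = ols(np.column_stack([snp, expression]), outcome)
    direct, b = joint[1], joint[2]  # SNP -> outcome, expression -> outcome
    total = ols(snp, outcome)[1]  # SNP -> outcome, ignoring expression
    return {"direct": direct, "indirect": a * b, "total": total}

# Simulated data (assumed values, not taken from the asthma or brain cancer studies).
rng = np.random.default_rng(0)
n = 5000
snp = rng.binomial(2, 0.3, size=n).astype(float)  # 0, 1 or 2 copies of an allele
expression = 0.5 * snp + rng.normal(size=n)  # expression depends on the SNP
outcome = 0.2 * snp + 0.8 * expression + rng.normal(size=n)

effects = mediation_effects(snp, expression, outcome)
print(effects)
```

In this linear setting the total association equals the direct part plus the mediated part, which is one way to see why looking at SNPs alone can blur a signal that runs largely through expression.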

The Annals paper describes the model with SNPs and expression in detail and its application to data connecting the gene ORMDL3 to asthma. Using the model, the authors found 15 SNPs in the gene with significant associations with disease, compared to only five that would have been apparent from analyzing SNPs alone. The researchers also found that their "p-values" (a measure of an association's statistical significance) were substantially lower, and therefore stronger, when using the combined analyses their model allows, compared to traditional methods that track just one variable or attempt to mix multiple data sets.

They know the model isn't likely just churning up a lot of false positive SNPs because they also tested it against "null" data where it shouldn't find anything, and indeed it didn't.
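
A check against "null" data of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses a simple correlation test with a large-sample normal approximation in place of their model, and the sample sizes, allele frequency and function names are hypothetical. The idea is that when the outcome is generated independently of the SNP, a well-behaved test should flag an association in only about the nominal fraction of data sets.

```python
import math
import numpy as np

def association_p_value(x, y):
    """Two-sided p-value for the correlation of x and y (normal approximation)."""
    r = np.corrcoef(x, y)[0, 1]
    z = r * math.sqrt(len(x))
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(n_subjects=500, n_datasets=2000, alpha=0.05, seed=0):
    """Fraction of null data sets in which the test reports an association."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_datasets):
        snp = rng.binomial(2, 0.3, size=n_subjects).astype(float)
        outcome = rng.normal(size=n_subjects)  # generated independently of the SNP
        if association_p_value(snp, outcome) < alpha:
            hits += 1
    return hits / n_datasets

rate = false_positive_rate()
print(f"false positive rate under the null: {rate:.3f}")
```

A rate far above the chosen threshold (0.05 here) would suggest the test is manufacturing associations rather than detecting them.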

Valid with different subjects

Huang further extends the model in Biostatistics and again reports similar results, namely new potentially relevant genes and lower p-values, in the asthma data set as well as one involving the gene GRB10 and glioblastoma multiforme brain tumors. But this paper makes additional contributions. One of them is showing that the model can be useful even when SNP data and expression data come from different people, as long as the subjects are generally comparable. Another is that it integrates not only SNPs and expression, but also DNA methylation data, which is a chemical alteration of DNA associated with expression.

This is important because gene expression and DNA methylation can be tissue dependent. In the case of brain cancer, it's rarely feasible for an epidemiologist to retrieve brain tissue from the same subjects from whom they can more readily sample DNA.

In a new study Huang will conduct with Brown epidemiology colleague Dominique Michaud, he plans to apply the model to new sets of brain cancer data, including DNA from subjects with and without tumors as well as expression data from tissue of people who died, either of brain cancer or of other causes.

There could be many other applications as well. The model's general structure of relationships between two variables (one of which may mediate the other) and an outcome, he adds, allows it to be applied to similarly structured phenomena, not just to genomics and disease.

"I think our approach is representative of a new framework of data integration," Huang said. "As long as you can lay out your biological question in terms of this kind of mediation, then our approach can help you easily analyze your data."

More information: biostatistics.oxfordjournals.o … tics.kxu014.abstract
