New method for finding disease-susceptibility genes

May 28, 2018 by Joo Hyeon Heo, Ulsan National Institute of Science and Technology
Comparison of false discovery controls. False discovery counts (FDR < 0.05) for competitive pathway analysis methods are shown. Credit: UNIST

A new study has produced a statistical algorithm that identifies potential disease genes more accurately and cost-effectively. The algorithm is a promising approach to finding candidate disease genes because it works effectively with less genomic data and returns results in only a minute or two.

This breakthrough has been reported by Professor Dougu Nam and his research team in the School of Life Sciences at UNIST. Their findings were published in Nucleic Acids Research on March 19, 2018.

In the study, the research team presented the novel method and software GSA-SNP2 for pathway enrichment analysis of GWAS P-value data. According to the research team, GSA-SNP2 provides high power, decent type I error control and fast computation by incorporating the random set model and a SNP-count adjusted gene score.
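The random set model named above can be illustrated with a short sketch. Under that model, a pathway's mean gene score is compared with the mean expected from a randomly drawn gene set of the same size. The Python below is an illustration under stated assumptions, not the GSA-SNP2 implementation: the function names, the toy gene scores and the normal approximation used for the P-value are hypothetical choices made for this example.

```python
import math
from statistics import fmean, pvariance


def random_set_z(gene_scores, pathway_genes):
    """Z-score of a pathway's mean gene score under a random set model.

    gene_scores: dict mapping gene name -> score (all genes analyzed).
    pathway_genes: iterable of gene names in the pathway.
    """
    scores = list(gene_scores.values())
    n_total = len(scores)
    # Keep only pathway members that actually have a score.
    members = [g for g in set(pathway_genes) if g in gene_scores]
    m = len(members)
    if m == 0 or m >= n_total:
        raise ValueError("pathway must contain between 1 and N-1 scored genes")

    mu = fmean(scores)               # mean score over all genes
    sigma2 = pvariance(scores, mu)   # population variance over all genes
    pathway_mean = fmean([gene_scores[g] for g in members])

    # Variance of the mean of m genes drawn without replacement from N genes.
    var = (sigma2 / m) * (n_total - m) / (n_total - 1)
    return (pathway_mean - mu) / math.sqrt(var)


def upper_tail_p(z):
    """One-sided P-value from a standard normal approximation (assumption)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Toy data, invented for illustration only.
toy_scores = {"A": 3.0, "B": 1.0, "C": 0.0, "D": 0.0}
z = random_set_z(toy_scores, ["A", "B"])
print(z, upper_tail_p(z))
```

The finite-population factor (N - m) / (N - 1) is what distinguishes this from a plain z-test: it accounts for the pathway genes being drawn without replacement from the full gene list.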

"GSA-SNP2 is a powerful and efficient tool for pathway enrichment and network analysis of genome-wide association study (GWAS) summary data," says Professor Nam. "With this algorithm, we can easily identify new drug targets, thereby deepening our understanding of diseases, and unlock new therapies to treat it."

Each individual's genome is a unique combination of DNA sequences that play major roles in determining who we are. These sequences account for many individual differences, including susceptibility to disease and diverse phenotypes. The most common of these genetic variations among humans are single nucleotide polymorphisms (SNPs). SNPs that correlate with specific diseases could serve as predictive biomarkers to aid the development of new drugs. Statistical analysis of GWAS summary data makes it possible to identify disease-associated SNPs.

Despite the enormous amounts of money and time invested in the statistical analysis of SNP data, conventional SNP detection methods have been unable to identify all disease-associated SNPs. This is because most conventional methods are designed to strictly control false positives in the results. As a result, even when tens of thousands of genomes and hundreds of thousands of SNPs are analyzed, the number of markers reported for candidate disease genes often reaches only several tens.

"Although controlling false positive SNPs is needed for the correct interpretation of the results, too much filtering may hamper its usefulness in drug development," says Professor Nam. "Therefore, enhanced statistical power is essential to practical statistical algorithms."

The team aimed to develop an algorithm that improves statistical power while maintaining accurate control of false positives. To do this, they applied a monotone cubic spline trend curve to the gene scores used in the competitive pathway analysis.
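The idea of adjusting gene scores with a monotone trend curve can be sketched as follows. The article does not give the details, so this example makes several assumptions that should be read as hypothetical: the raw gene score is taken as something like -log10 of the best SNP P-value in the gene, the trend is fitted against log10 of the SNP count, the knot values are supplied by the caller, and the adjusted score is the raw score minus the trend. The curve itself is a standard monotone cubic Hermite interpolant (Fritsch-Carlson style slopes), which may differ from the spline used in GSA-SNP2.

```python
import bisect
import math


def monotone_cubic(xs, ys):
    """Return a monotone cubic Hermite interpolant through (xs, ys).

    xs must be strictly increasing. Slopes at interior knots use a weighted
    harmonic mean, which keeps the curve monotone wherever the data are.
    """
    n = len(xs)
    if n < 2 or len(ys) != n:
        raise ValueError("need at least two knots with matching lengths")
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    if any(step <= 0 for step in h):
        raise ValueError("xs must be strictly increasing")
    d = [(ys[i + 1] - ys[i]) / h[i] for i in range(n - 1)]  # secant slopes

    m = [0.0] * n
    m[0], m[-1] = d[0], d[-1]
    for i in range(1, n - 1):
        if d[i - 1] * d[i] > 0:  # same sign: harmonic mean; otherwise keep 0
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            m[i] = (w1 + w2) / (w1 / d[i - 1] + w2 / d[i])

    def curve(x):
        # Outside the knot range, hold the end values (an assumption).
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect.bisect_right(xs, x) - 1
        t = (x - xs[i]) / h[i]
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        return (h00 * ys[i] + h10 * h[i] * m[i]
                + h01 * ys[i + 1] + h11 * h[i] * m[i + 1])

    return curve


def adjust_gene_scores(raw_scores, snp_counts, knot_x, knot_y):
    """Subtract the trend at log10(SNP count) from each raw gene score."""
    trend = monotone_cubic(knot_x, knot_y)
    adjusted = {}
    for gene, score in raw_scores.items():
        count = snp_counts[gene]
        if count < 1:
            raise ValueError("each gene needs at least one SNP")
        adjusted[gene] = score - trend(math.log10(count))
    return adjusted


# Toy values, invented for illustration: two genes with the same raw score
# but different SNP counts end up with different adjusted scores.
print(adjust_gene_scores({"GENE_A": 5.0, "GENE_B": 5.0},
                         {"GENE_A": 10, "GENE_B": 100},
                         knot_x=[0.0, 1.0, 2.0], knot_y=[2.0, 3.0, 4.5]))
```

The point of the monotone constraint is that a gene with more SNPs should never receive a smaller correction than a gene with fewer, since more SNPs give more chances of a small P-value arising by chance.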

In a comparative study using simulated and real GWAS data, GSA-SNP2 exhibited high power and best prioritized gold standard positive pathways compared with six existing enrichment-based methods and two self-contained methods. Based on these results, the difference between pathway analysis approaches was investigated and the effects of the gene correlation structures on the pathway enrichment analysis were also discussed. In addition, GSA-SNP2 is able to visualize protein interaction networks within and across the significant pathways so that the user can prioritize the core subnetworks for further studies.

Comparison of statistical powers. Powers of competitive pathway analysis methods under the four different simulation settings are represented. Credit: UNIST

According to the research team, GSA-SNP2 greatly improves type I error control by using SNP-count adjusted gene scores while preserving high statistical power. It also provides both local and global protein interaction networks in the associated pathways, and may facilitate integrated pathway and network analysis of GWAS data.
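For readers unfamiliar with the FDR < 0.05 threshold used in the comparison figure, the sketch below shows one common way pathway P-values are converted to false discovery rate estimates, the Benjamini-Hochberg procedure. The article does not say which FDR procedure the compared tools use, so treating it as Benjamini-Hochberg is an assumption, and the P-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (q-values), in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending P
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q[idx] = running_min
    return q


# Hypothetical pathway P-values.
pathway_p = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(pathway_p)
significant = [i for i, q in enumerate(q_values) if q < 0.05]
print(q_values, significant)
```

A pathway counted as a false discovery in such a comparison would be one that passes the q < 0.05 cut even though the simulated data contain no true association for it.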

The research team expects this network visualization to help users prioritize the core subnetworks for further studies.

More information: Nucleic Acids Research (2018). DOI: 10.1093/nar/gky175
