New method developed for ranking disease-causal mutations within whole genome sequences

Researchers from the University of Washington and the HudsonAlpha Institute for Biotechnology have developed a new method for organizing and prioritizing genetic data. The Combined Annotation–Dependent Depletion, or CADD, method will assist scientists in their search for disease-causing mutation events in human genomes.

The new method is the subject of a paper titled "A general framework for estimating the relative pathogenicity of human genetic variants," published in Nature Genetics.

Current methods for prioritizing genetic variants consider only one or a few factors and use a small subset of the information available. For example, the Encyclopedia of DNA Elements, or ENCODE, catalogs various types of functional elements in human genomes, while sequence conservation analysis looks for similar or identical sequences that have survived across different species through hundreds of millions of years of evolution. CADD brings all of these data, and more, together into one score, providing a ranking that helps researchers discern which variants may be linked to disease and which may not.

"CADD will substantially improve our ability to identify disease-causal mutations, will continue to get better as genomic databases grow, and is an important analytical advance needed to better exploit the information content of whole-genome sequences in both clinical and research settings," said Gregory M. Cooper, Ph.D., faculty investigator at HudsonAlpha and one of the collaborators on CADD.

The goal in developing the new approach was to distill the overwhelming amount of available data into a single score that a researcher or clinician can evaluate more easily. To accomplish that, CADD compares the properties of 15 million genetic variants separating humans from chimpanzees with those of 15 million simulated variants. Variants observed in humans have survived natural selection, which tends to remove harmful, disease-causing variants, while simulated variants have not been exposed to selection. Thus, by comparing observed variants to simulated ones, CADD is able to identify the properties that make a variant harmful or disease-causing. The resulting scores, called C scores, have been pre-computed for all 8.6 billion possible single nucleotide variants and are freely available to researchers.
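The observed-versus-simulated comparison can be illustrated with a toy sketch in Python. It is not the CADD implementation: the two annotations (`conservation`, `functional`), the survival rule standing in for natural selection, and the use of a small logistic regression are all assumptions made for illustration, and every variant is synthetic. The idea it shows is only that a model trained to tell simulated variants from observed ones ends up giving higher scores to the kinds of variants that selection tends to remove.

```python
import math
import random


def make_variants(n, observed, rng):
    """Return n toy variants as (conservation, functional) pairs.

    Both annotations are hypothetical. Observed variants pass through a
    made-up survival filter that mimics selection; simulated ones do not.
    """
    variants = []
    while len(variants) < n:
        conservation = rng.random()
        functional = 1.0 if rng.random() < 0.3 else 0.0
        if observed:
            # Assumed rule: variants at conserved, functional sites survive less often.
            survival = 1.0 - 0.8 * conservation * (0.5 + 0.5 * functional)
            if rng.random() > survival:
                continue
        variants.append((conservation, functional))
    return variants


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=200, lr=0.5):
    """Fit a logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def score(variant, weights, bias):
    """Higher score = looks more like a simulated (unselected) variant."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, variant)))


rng = random.Random(0)
observed = make_variants(2000, True, rng)
simulated = make_variants(2000, False, rng)

# Label simulated variants 1 and observed variants 0, then train.
features = observed + simulated
labels = [0.0] * len(observed) + [1.0] * len(simulated)
weights, bias = train(features, labels)

conserved_site = score((0.95, 1.0), weights, bias)
neutral_site = score((0.05, 0.0), weights, bias)
print("conserved functional site:", round(conserved_site, 3))
print("unconserved nonfunctional site:", round(neutral_site, 3))
```

In this sketch a variant at a highly conserved, functional site receives a higher score than one at an unconserved site, because such variants are under-represented among the toy "observed" set. The real method combines many more annotations across the whole genome.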

"We didn't know what to expect," Cooper said, "but we were pleasantly surprised that CADD was able not only to be applicable to mutations everywhere in the genome but in fact do a substantially better job in nearly every test that we performed than other metrics."

The CADD method differs from other algorithms in that it assigns scores to mutations anywhere in human genomes, not just in the less than two percent that encodes proteins (the "exome"). This attribute will be crucial as whole-genome sequencing becomes routine in both clinical and research settings.

More information: www.nature.com/ng/journal/vaop… nt/full/ng.2892.html
