Disease-associated genes routinely missed in some genetic studies

April 21, 2017 by Sam Sholtis
Penn State researchers identified 832 genes that have low coverage across multiple whole-exome sequencing platforms. These genes are associated with leukemia, psoriasis, heart failure and other diseases, and may be missed by researchers using whole-exome sequencing to study these diseases. Credit: Penn State University, Carley LaVelle

Whole-exome DNA sequencing, a technology that saves time and money by sequencing only protein-coding regions and not the entire genome, may routinely miss some genetic variations associated with disease, according to Penn State researchers who have developed new ways to identify such omissions.

Whole-exome sequencing has been used in many studies to identify genes associated with disease, and by clinical labs to diagnose patients with genetic disorders. However, the new research shows that these studies may routinely miss mutations in a subset of disease-causing genes (associated with leukemia, psoriasis, heart failure and other diseases) that occur in regions of the genome that are read less often by the cost-saving technology. A paper describing the research appeared online April 13 in the journal Scientific Reports.

"Although it was known that coverage (the average number of times a given piece of DNA is read during sequencing) could be uneven in whole-exome sequencing, our new methods are the first to really quantify this," said Santhosh Girirajan, assistant professor of biochemistry and molecular biology and of anthropology at Penn State and an author of the paper. "Adequate coverage, often as many as 70 or more reads for each piece of DNA, increases our confidence that the sequence is accurate, and without it, it is nearly impossible to make confident predictions about the relationship between a mutation in a gene and a disease. In our study, we found 832 genes that have systematically low coverage across three different sequencing platforms, meaning that these genes would be missed in disease studies."

The researchers developed two different methods to identify low-coverage regions in whole-exome sequence data. The first method identifies regions with inconsistent coverage compared to other regions in the genome from multiple samples. The second method calculates the number of low-coverage regions among different samples in the same study. They have packaged both methods into an open-source software package for other researchers to use.
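To make the two ideas concrete, here is a minimal Python sketch. It is not the researchers' software and does not reproduce their metrics. It only illustrates, on invented data, what it means to compare a region's coverage with the rest of a sample and to count how often a region is poorly covered across samples. The region names, read depths, the cutoff of 10 reads and both function names are assumptions made for illustration.

```python
from statistics import mean, median

# Hypothetical toy data: read depth per region for each sample.
# Region names and depths are invented for illustration only.
DEPTHS = {
    "sample_1": {"region_A": 80, "region_B": 75, "region_C": 8},
    "sample_2": {"region_A": 60, "region_B": 70, "region_C": 5},
    "sample_3": {"region_A": 90, "region_B": 85, "region_C": 9},
}

LOW_COVERAGE_THRESHOLD = 10  # assumed cutoff in reads; not taken from the paper


def relative_coverage(depths):
    """Mean ratio of each region's depth to the median depth of its sample."""
    regions = sorted(next(iter(depths.values())))
    ratios = {region: [] for region in regions}
    for per_region in depths.values():
        sample_median = median(per_region.values())
        for region in regions:
            ratios[region].append(per_region[region] / sample_median)
    return {region: mean(values) for region, values in ratios.items()}


def low_coverage_fraction(depths, threshold=LOW_COVERAGE_THRESHOLD):
    """Fraction of samples in which each region falls below the threshold."""
    regions = sorted(next(iter(depths.values())))
    n_samples = len(depths)
    return {
        region: sum(s[region] < threshold for s in depths.values()) / n_samples
        for region in regions
    }


if __name__ == "__main__":
    rel = relative_coverage(DEPTHS)
    frac = low_coverage_fraction(DEPTHS)
    for region in sorted(rel):
        print(f"{region}: relative coverage {rel[region]:.2f}, "
              f"below threshold in {frac[region]:.0%} of samples")
```

In this toy data set, region_C is read far less often than the other regions in every sample, so it stands out on both measures even though the average depth of each sample looks healthy.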

"Even when the average coverage in a whole-exome sequencing study was high, some regions appeared to have systematically low-coverage," said Qingyu Wang, a graduate student at Penn State at the time of the research and the first author of the paper.

Low-coverage regions may result from limited precision in whole-exome sequencing technologies due to certain genomic features. Highly repetitive stretches of DNA, regions of the genome where the same simple sequence of As, Ts, Cs and Gs can be repeated many times, can prevent the sequencer from reading the DNA properly. Indeed, the study showed that at least 60 percent of low-coverage genes occur near DNA repeats. As an example, the gene MAST4 contains a repeated sequence element that leads to a three-fold reduction in coverage compared to non-repeating sequences. Even when other genes have sufficient coverage, this region of the MAST4 gene falls well below the recommended coverage to detect genetic variations in these studies.

"One solution to this problem is for researchers to use whole-genome sequencing, which examines all base pairs of DNA instead of just the regions that contain genes," said Girirajan. "Our study found that whole-genome data had significantly fewer low-coverage genes than whole-exome data, and its coverage is more uniformly distributed across all parts of the genome. However, the costs of whole-exome sequencing are still significantly lower than whole-genome sequencing. Until the cost of whole-genome sequencing is no longer a barrier, human genetics researchers should be aware of these limitations in whole-exome sequencing technologies."

More information: Qingyu Wang et al. Novel metrics to measure coverage in whole exome sequencing datasets reveal local and global non-uniformity, Scientific Reports (2017). DOI: 10.1038/s41598-017-01005-x
