Study shows higher than expected sequencing errors in public databases

February 17, 2017 by Bob Yirka, Medical Xpress report
A depiction of the double helical structure of DNA. Its four coding units (A, T, C, G) are color-coded in pink, orange, purple and yellow. Credit: NHGRI

(Medical Xpress)—A team of researchers with New England Biolabs Inc. (NEB) has found that sequenced DNA samples held in public databases had higher than expected low-frequency mutation error rates. In their paper published in the journal Science, the team describes how they created an algorithm that is able to calculate an error rate for samples in a database and what it showed when run on two public genome databases.

Researchers studying how DNA mutations in cells lead to cancerous tumors rely on the accuracy of databases that hold sequencing information. Those looking for commonalities among different groups of people, for example, depend on such databases when attempting to isolate trends. Such studies involve comparing the genomes of people who carry low-frequency mutations with those of the general population and using the findings to build cancer datasets. Now, work by the team at NEB has called the accuracy of public databases into question, and with it the accuracy of the cancer datasets built from them.

To measure the accuracy of a given dataset, the researchers created an algorithm that counts the sequences showing mutations caused by damage during the sequencing process versus those showing mutations that arose naturally. The team then used the algorithm to calculate error rates for several public databases, most notably the 1000 Genomes Project and part of The Cancer Genome Atlas (TCGA) database, for which they report error rates of 41 percent and 73 percent, respectively.
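The article does not spell out how the algorithm works, so the following is only a rough sketch of the general idea of separating damage-induced changes from natural ones by counting. It assumes, hypothetically, that each dataset has already been tallied into two counts for a given base change: how often it appears in the first read of a pair and how often in the second. A natural variant would be expected to show up about equally in both, so a lopsided ratio hints at damage. The function names, the input format, and the 1.5 cutoff are illustrative assumptions, not the authors' published method.

```python
from typing import Iterable


def imbalance_ratio(read1_count: int, read2_count: int) -> float:
    """Ratio of a change's count in first reads to its count in second reads.

    Hypothetical indicator: a value near 1.0 suggests the change is balanced
    (consistent with a natural variant); a larger value suggests an excess
    that could come from damage.
    """
    if read2_count == 0:
        raise ValueError("read2_count must be positive to form a ratio")
    return read1_count / read2_count


def fraction_flagged(ratios: Iterable[float], threshold: float = 1.5) -> float:
    """Share of datasets whose ratio is strictly above an assumed cutoff."""
    values = list(ratios)
    if not values:
        raise ValueError("at least one ratio is required")
    flagged = sum(1 for r in values if r > threshold)
    return flagged / len(values)


# Made-up counts for four imaginary datasets (not real data).
example_ratios = [
    imbalance_ratio(100, 100),
    imbalance_ratio(200, 100),
    imbalance_ratio(300, 100),
    imbalance_ratio(120, 100),
]
example_share = fraction_flagged(example_ratios)
```

With these invented counts, two of the four imaginary datasets exceed the assumed cutoff. A real analysis would need read-level data and a validated threshold.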

The researchers note that their algorithm cannot reveal the source of the unnatural damage, but they suggest it is likely due to certain sample preparation techniques used prior to sequencing. They also point out that other algorithms have been developed that let those doing the sequencing test their own work for errors, but for lack of a compelling reason, these have not been widely used. The team suggests that sequencing groups begin using them. They also note that new tools have been developed that could help minimize DNA damage during preparation, and that their use could improve the accuracy of public databases.


More information: Lixin Chen et al. DNA damage is a pervasive cause of sequencing errors, directly confounding variant identification, Science (2017). DOI: 10.1126/science.aai8690

Abstract
Mutations in somatic cells generate a heterogeneous genomic population and may result in serious medical conditions. Although cancer is typically associated with somatic variations, advances in DNA sequencing indicate that cell-specific variants affect a number of phenotypes and pathologies. Here, we show that mutagenic damage accounts for the majority of the erroneous identification of variants with low to moderate (1 to 5%) frequency. More important, we found signatures of damage in most sequencing data sets in widely used resources, including the 1000 Genomes Project and The Cancer Genome Atlas, establishing damage as a pervasive cause of sequencing errors. The extent of this damage directly confounds the determination of somatic variants in these data sets.
