Sieving through 'junk' DNA reveals cancer-causing genetic mutations

October 3, 2013, Wellcome Trust Sanger Institute
Three-dimensional view of human regulatory network with grey edges showing connections between transcription factors (TF) and their target genes. Green nodes represent genes with HighD SNPs (showing high allele frequency difference among human populations) in their promoters. Size of green nodes scaled based on their degree centrality. Nodes with higher centrality are bigger and tend to be in the center. This movie shows HighD sites tend to occur in hub promoters. Credit: Vaja Liluashvili, Zeynep H. Gümüş

Researchers can now identify DNA regions within non-coding DNA, the major part of the genome that is not translated into a protein, where mutations can cause diseases such as cancer.

Their approach reveals many potential genetic variants within non-coding DNA that drive the development of a variety of different cancers. The same approach has great potential to find other disease-causing variants.

Unlike the coding region of the genome, where our 23,000 protein-coding genes lie, the non-coding region, which makes up 98% of our genome, is poorly understood. Recent studies have emphasised the biological value of the non-coding regions, previously considered 'junk' DNA, in the regulation of proteins. This new information provides a starting point for researchers to sieve through the non-coding regions and identify the most functionally important ones.

"Our technique allows scientists to focus in on the most functionally important parts of the non-coding regions of the genome," says Professor Mark Gerstein, senior author from Yale University. "This is not just beneficial for cancer research, but can be extended to other genetic diseases too."

The team used the full set of genetic variants from the first phase of the 1000 Genomes Project, together with information about the non-coding regions generated by the ENCODE Project, and identified regions that did not accumulate much variation. Protein-coding genes play a crucial role in human survival and fitness, and are under strong 'purifying' selection, which removes variation. The team found that some non-coding DNA regions showed almost the same low levels of variation as protein-coding genes, and called these 'ultrasensitive' regions.
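The idea of flagging regions that "did not accumulate much variation" can be illustrated with a minimal Python sketch. It is not the team's actual statistic: the `Region` record, the variants-per-kilobase measure, the `tolerance` parameter and all the numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    length_bp: int      # region size in base pairs
    variant_count: int  # variants seen in the population sample (made-up values below)


def variants_per_kb(region):
    """Variant density: observed variants per 1,000 base pairs."""
    return 1000.0 * region.variant_count / region.length_bp


def flag_low_variation(regions, coding_baseline, tolerance=1.2):
    """Names of regions whose density is at most `tolerance` times the coding baseline.

    Both the baseline and the tolerance are hypothetical knobs, not published values.
    """
    return [r.name for r in regions
            if variants_per_kb(r) <= tolerance * coding_baseline]


regions = [
    Region("noncoding_A", 2000, 6),   # 3.0 variants per kb
    Region("noncoding_B", 5000, 60),  # 12.0 variants per kb
    Region("noncoding_C", 1000, 3),   # 3.0 variants per kb
]
coding_baseline = 3.0  # illustrative density for protein-coding sequence

flagged = flag_low_variation(regions, coding_baseline)
print(flagged)
```

In this toy data set, regions A and C vary about as little as the coding baseline and would be the candidates for the 'ultrasensitive' label, while region B would not.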

Within the ultrasensitive regions, they looked at specific single DNA letters that, when altered, caused the greatest disturbance to the genetic region. If this non-coding, ultrasensitive region is central to a network of many related genes, variation can cause a greater knock-on effect, resulting in disease.

They integrated all this information to develop a computer workflow known as FunSeq. This system prioritises genetic variants in the non-coding regions based on their predicted impact on human disease. "Our method is a practical and successful way to screen for purifying selection in non-coding regions of the genome using freely available data such as those from the ENCODE and 1000 Genomes Projects," says Dr Yali Xue, author from the Wellcome Trust Sanger Institute. "It really shows the value of these large-scale open access data-sets."
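How such a prioritisation might combine the different lines of evidence can be sketched in a few lines of Python. This is a hypothetical additive score, not FunSeq's real scoring scheme, which the article does not describe: the field names (`ultrasensitive`, `disrupts_site`), the `hub_cutoff` threshold and the toy network are all assumptions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes each node is directly connected to."""
    neighbours = defaultdict(set)
    for tf, target in edges:
        neighbours[tf].add(target)
        neighbours[target].add(tf)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def prioritise(variants, centrality, hub_cutoff=0.5):
    """Rank variants by a simple additive score (one point per line of evidence)."""
    scored = []
    for v in variants:
        score = 0
        if v["ultrasensitive"]:   # lies in a low-variation non-coding region
            score += 1
        if v["disrupts_site"]:    # single-letter change predicted to disturb the region
            score += 1
        if centrality.get(v["gene"], 0.0) >= hub_cutoff:  # linked gene is a network hub
            score += 1
        scored.append((score, v["id"]))
    # Highest score first; ties broken by variant id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


# Toy regulatory network: (transcription factor, target gene) pairs.
edges = [("TF1", "GENE_A"), ("TF2", "GENE_A"), ("TF3", "GENE_A"), ("TF1", "GENE_B")]
centrality = degree_centrality(edges)

variants = [
    {"id": "var1", "gene": "GENE_A", "ultrasensitive": True, "disrupts_site": True},
    {"id": "var2", "gene": "GENE_B", "ultrasensitive": True, "disrupts_site": False},
    {"id": "var3", "gene": "GENE_B", "ultrasensitive": False, "disrupts_site": False},
]
ranked = prioritise(variants, centrality)
print(ranked)
```

Here the variant near the hub gene rises to the top of the list, mirroring the article's point that a change in an ultrasensitive region central to a network of many related genes is expected to have the greatest knock-on effect.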

The team applied FunSeq to 90 cancer genomes, including breast cancer, prostate cancer and brain tumours, and found nearly 100 potential non-coding cancer-driving variants. In the breast cancer samples, for example, they found a single DNA letter change that seems to have great impact on the development of cancer. This single letter change occurs in an ultrasensitive region that is central to a network of many related genes.

"Although we see that the first effective use of our tool is for cancer genomes, this method can be applied to find any potential disease-causing variant in the non-coding regions of the genome," says Dr Chris Tyler-Smith, lead author from the Wellcome Trust Sanger Institute. "We are excited about the vast potential of this method to find further disease-causing, and also beneficial variants, in these crucial but unexplored areas of our genome."

More information: Ekta Khurana, Yao Fu, Vincenza Colonna, Xinmeng Jasmine Mu et al (2013). "Integrative annotation of variants from 1,092 humans: application to cancer genomics" Advanced online publication in Science, 03 October, 2013.
