Individuals' medical histories predicted by their noncoding genomes, study finds

February 4, 2016, Stanford University Medical Center
This stylistic diagram shows a gene in relation to the double helix structure of DNA and to a chromosome (right). The chromosome is X-shaped because it is dividing. Introns are regions often found in eukaryote genes that are removed in the splicing process (after the DNA is transcribed into RNA): Only the exons encode the protein. The diagram labels a region of only 55 or so bases as a gene. In reality, most genes are hundreds of times longer. Credit: Thomas Splettstoesser/Wikipedia/CC BY-SA 4.0

Identifying mutations in the control switches of genes can be a surprisingly accurate way to predict a person's medical history, researchers at the Stanford University School of Medicine have found.

When the scientists used the technique to analyze the whole genome sequences of five individuals, they found that a person with narcolepsy had mutations in the regulatory regions of genes controlling alertness; a person with a family history of sudden cardiac death had mutations in regions controlling genes associated with cardiac output; and a person with hypertension had mutations in regions controlling sodium levels circulating in the blood.

"The beauty of having whole genomes available for study is that you can then ask completely agnostic questions," said Gill Bejerano, PhD, an associate professor of developmental biology, of pediatrics and of computer science at Stanford. "We set out to find hidden layers of susceptibility in the regulatory regions of these genomes. We were very pleased that our analysis gave such clear and significant associations between the mutations and medical histories."

Bejerano, a genomicist who is a member of the Stanford Artificial Intelligence Lab, Child Health Research Institute, Neurosciences Institute, Cancer Institute and Bio-X, is the senior author of a paper describing the research, which will be published Feb. 4 in PLOS Computational Biology. The first author is Harendra Guturu, PhD, a former Stanford graduate student who is now a research associate in pediatrics at the university.

Importance of regulatory regions

The researchers focused their analyses on a relatively small proportion of each person's genome—the sequences of regulatory regions that have been faithfully conserved among many species over millions of years of evolution. Proteins called transcription factors bind to regulatory regions to control when, where and how genes are expressed. Some regulatory regions have evolved to generate species-specific differences—for example, mutating in a way that changes the expression of a gene involved in foot anatomy in humans—while other regions have stayed mostly the same for millennia.

"In these cases, evolution has given a clear signal that these regions are important to key biological pathways, and it's important for them to stick around," said Bejerano.

All of us have some natural variation in our genome, accumulated through botched DNA replication, chemical mutation and simple errors that arise when each cell tries to successfully copy 3 billion nucleotides prior to each cell division. When these errors occur in our sperm or egg cells, they are passed to our children and perhaps grandchildren. These variations, called polymorphisms, are usually, but not always, harmless.

GREAT work

Guturu looked for what are called single nucleotide polymorphisms, or SNPs, in the DNA of five people who have made their genomes and information about their own or their family's medical history publicly available for use by researchers worldwide. SNPs are places along a chromosome where the DNA sequence varies from a composite human DNA reference sequence by one letter, or nucleotide.
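
As a rough illustration of that definition, the sketch below compares a sample sequence with a reference of the same length and reports every single-letter difference. The function name find_snps and the toy sequences are hypothetical and are not taken from the study; real pipelines work from aligned sequencing reads and variant-call files rather than a plain string comparison.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each single-letter difference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Toy sequences, invented for illustration only.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
snps = find_snps(reference, sample)
print(snps)
```

Positions here are zero-based offsets into the toy strings; genome coordinates in practice are reported per chromosome.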

Rather than search through the whole genome, Guturu focused on SNPs in evolutionarily conserved regulatory regions. Even within these regions, each person had many SNPs. So Guturu used a software program, Predicting Regulatory Information from Single Motifs, developed in the Bejerano lab, to predict which nucleotide changes were likely to disrupt the conserved binding of a transcription factor.
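
The article does not describe how the Bejerano lab's program makes this prediction. One common, generic way to estimate whether a single-letter change weakens a binding site is to score the site against a position weight matrix before and after the change. The sketch below shows that generic idea only; the matrix MOTIF, the uniform background frequency and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical base frequencies for a 4-base binding motif, one dict per position.
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency for every base


def motif_score(site):
    """Sum of per-position log-odds scores (log base 2) for a site of motif length."""
    return sum(math.log2(MOTIF[i][base] / BACKGROUND) for i, base in enumerate(site))


def score_change(site, offset, alt_base):
    """Score of the mutated site minus score of the original site."""
    mutated = site[:offset] + alt_base + site[offset + 1:]
    return motif_score(mutated) - motif_score(site)


ref_site = "AGCT"
delta = score_change(ref_site, 1, "T")  # replace the G at offset 1 with T
print(f"score change: {delta:.2f}")
```

A strongly negative change suggests the variant makes the site a poorer match to the motif. How large a drop counts as disruptive is a threshold the analyst would have to choose.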

Guturu then turned to software called Genomic Regions Enrichment of Annotations Tool to determine whether the disrupted binding sites were likely to perturb the expression of groups of genes that together control a particular biological function. GREAT, which was also developed in the Bejerano lab, curates knowledge about the diverse functions of thousands of different groups of genes. For any set of genomic regions a user inputs, GREAT determines the most common set or sets of nearby genes.
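
To give a sense of what an enrichment question looks like, here is a minimal sketch that uses a hypergeometric test over genes: given how many genes carry a certain annotation overall, how surprising is it to see that many annotated genes near the disrupted sites? This is a textbook test chosen for illustration. The article does not spell out GREAT's own statistics, which may differ, and the function name and all numbers below are hypothetical.

```python
from math import comb


def hypergeom_pvalue(total_genes, annotated_genes, picked_genes, picked_annotated):
    """Probability of seeing at least picked_annotated annotated genes
    when picked_genes are drawn at random, without replacement, from total_genes."""
    upper = min(annotated_genes, picked_genes)
    tail = sum(
        comb(annotated_genes, k) * comb(total_genes - annotated_genes, picked_genes - k)
        for k in range(picked_annotated, upper + 1)
    )
    return tail / comb(total_genes, picked_genes)


# Invented numbers: 20,000 genes in total, 100 carrying some annotation,
# 50 genes near disrupted sites, 5 of which carry that annotation.
p = hypergeom_pvalue(20000, 100, 50, 5)
print(f"p-value: {p:.2e}")
```

A small p-value means the overlap would rarely arise by chance under this simple model. An actual analysis would also need to correct for the many annotations tested.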

Using this approach to study the genomes of the five individuals, Guturu, Bejerano and their colleagues found that one of the individuals who had a family history of sudden cardiac death had a surprising accumulation of variants associated with "abnormal cardiac output"; another with hypertension had variants likely to affect genes involved in circulating sodium levels; and another with narcolepsy had variants affecting parasympathetic nervous system development. In all five cases, GREAT reported results that jibed with what was known about that individual's self-reported medical history, and that were rarely seen in the more than 1,000 other genomes used as controls.

'Exciting avenue for study'

The researchers would like to create a web portal that would allow others to easily conduct similar studies. However, they concede that, for some diseases, the results may not be so clear-cut.

"We are the sum of billions of transcription-factor-binding events in thousands of cell types throughout our bodies," said Bejerano. "Not every disease will be amenable to this type of analysis. But this study shows that nature, even the noncoding genome, can be very benevolent when you ask the right questions. And it may help us begin to combine our knowledge about variations, or mutations, that occur throughout the genome. It's a very exciting avenue for study."

The research is an example of Stanford Medicine's focus on precision health, the goal of which is to anticipate and prevent disease in the healthy and precisely diagnose and treat disease in the ill.

Other Stanford co-authors of the paper are graduate student Sandeep Chinchali and former graduate student Shoa Clarke, MD, PhD.

Bejerano and Guturu have filed a patent application on the algorithm used in this study.
