Study suggests a unified model for how DNA is read, offering insight into how genes evolve

Re-learning how to read a genome — New research has revealed that the initial steps of reading DNA are actually remarkably similar at both the genes that encode proteins (here, on the right) and regulatory elements (on the left). The main differences seem to occur after this initial step. Gene messages are long and stable enough to ensure that genes become proteins, whereas regulatory messages are short and unstable, and are rapidly "cleaned up" by the cell. Credit: Adam Siepel, Cold Spring Harbor Laboratory

There are roughly 20,000 genes and thousands of other regulatory "elements" stored within the three billion letters of the human genome. Genes encode information that is used to create proteins, while other genomic elements help regulate the activation of genes, among other tasks. Somehow all of this coded information within our DNA needs to be read by complex molecular machinery and transcribed into messages that can be used by our cells.

Usually, reading a gene is thought to be a lot like reading a sentence. The reading machinery is guided to the start of the gene by various sequences in the DNA - the equivalent of a capital letter - and proceeds from left to right, DNA letter by DNA letter, until it reaches a sequence that forms a punctuation mark at the end. The capital letter and punctuation marks that tell the cell where, when, and how to read a gene are known as regulatory elements.

But scientists have recently discovered that genes aren't the only messages read by the cell. In fact, many regulatory elements themselves are also read and transcribed into messages, the equivalent of pronouncing the words "capital letter," "comma," or "period." Even more surprising, genes are read bi-directionally from so-called "start sites" - in effect, generating messages in both forward and backward directions.

With all these messages, how does the cell know which one encodes the information needed to make a protein? Is there something different about the reading process at genes and regulatory elements that helps avoid confusion? New research, published today in Nature Genetics, has revealed that the initial steps of the reading process itself are actually remarkably similar at both genes and regulatory elements. The main differences seem to occur after this initial step, in the length and stability of the messages. Gene messages are long and stable enough to ensure that genes become proteins, whereas regulatory messages are short and unstable, and are rapidly "cleaned up" by the cell.

To make the distinction, the team, which was co-led by CSHL Professor Adam Siepel and Cornell University Professor John Lis, looked for differences between the initial reading processes at genes and a set of regulatory elements called enhancers. "We took advantage of highly sensitive experimental techniques developed in the Lis lab to measure newly made messages in the cell," says Siepel. "It's like having a new, more powerful microscope for observing the process of transcription as it occurs in living cells."

Remarkably, the team found that the reading patterns for enhancer and gene messages are highly similar in many respects, sharing a common architecture. "Our data suggests that the same basic reading process is happening at genes and these non-genic regulatory elements," explains Siepel. "This points to a unified model for how DNA transcription is initiated throughout the genome."

Working together, the biochemists from Lis's laboratory and the computer jockeys from Siepel's group carefully compared the patterns at enhancers and genes, combining their own data with vast public data sets from the NIH's Encyclopedia of DNA Elements (ENCODE) project. "By many different measures, we found that the patterns of transcription initiation are essentially the same at enhancers and genes," says Siepel. "Most RNA messages are rapidly targeted for destruction, but the messages at genes that are read in the right direction - those destined to be a protein - are spared from destruction." The team was able to devise a model to mathematically explain the difference between stable and unstable transcripts, offering insight into what defines a gene. According to Siepel, "Our analysis shows that the 'code' for stability is, in large part, written in the DNA, at enhancers and genes alike."

This work has important implications for the evolutionary origins of new genes, according to Siepel. "Because DNA is read in both directions from any start site, every one of these sites has the potential to generate two protein-coding genes with just a few subtle changes. The genome is full of potential new genes."

More information: "Analysis of transcription start sites from nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers." Leighton Core, André Martins, Charles Danko, Colin Waters, Adam Siepel, and John Lis, Nature Genetics, November 10, 2014: dx.doi.org/10.1038/ng.3142

Journal information: Nature Genetics

Provided by Cold Spring Harbor Laboratory

