Team reveals that human genome could contain up to 20 percent fewer genes

August 31, 2018, The National Centre for Cancer Research
This stylistic diagram shows a gene in relation to the double helix structure of DNA and to a chromosome (right). The chromosome is X-shaped because it is dividing. Introns are regions often found in eukaryote genes that are removed in the splicing process (after the DNA is transcribed into RNA): Only the exons encode the protein. The diagram labels a region of only 55 or so bases as a gene. In reality, most genes are hundreds of times longer. Credit: Thomas Splettstoesser/Wikipedia/CC BY-SA 4.0

A new study led by the Spanish National Cancer Research Centre (CNIO) reveals that up to 20 percent of genes classified as coding (those that produce the proteins that are the building blocks of all living things) may not be coding after all, because they have characteristics typical of non-coding genes or pseudogenes (obsolete coding genes). The consequent reduction in the size of the human genome could have important effects in biomedicine, since the number of genes that produce proteins, and their identification, is of vital importance for the investigation of multiple diseases, including cancer and cardiovascular diseases.

The work, published in the journal Nucleic Acids Research, is the result of an international collaboration led by Michael Tress of the CNIO Bioinformatics Unit along with researchers from the Wellcome Trust Sanger Institute in the United Kingdom, the Massachusetts Institute of Technology in the United States, the Pompeu Fabra University and the National Center for Supercomputing (BSC-CNS) in Barcelona, and the National Center for Cardiovascular Research (CNIC) in Madrid.

Since the completion of the sequencing of the human genome in 2003, experts from around the world have been working to compile the final human proteome (the total number of proteins generated from genes) and the genes that produce them. This task is immense, given the complexity of the human genome and the fact that humans have about 20,000 separate genes.

The researchers analyzed the genes cataloged as protein coding in the main reference human proteomes. A detailed comparison of the reference proteomes from GENCODE/Ensembl, RefSeq and UniProtKB found 22,210 coding genes, but only 19,446 of these genes were present in all three annotations.

When they analyzed the 2,764 genes that were present in only one or two of these reference annotations, they were surprised to discover that experimental evidence and manual annotations suggested that almost all of these genes were more likely to be non-coding genes or pseudogenes. In fact, these genes, together with another 1,470 coding genes that are present in all three reference catalogs, were not evolving like typical protein coding genes. The conclusion of the study is that most of these 4,234 genes probably do not code for proteins.
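The figures in the two paragraphs above fit together arithmetically, and the "up to 20 percent" in the headline follows from them. A minimal Python sketch of that check, using only the numbers quoted in the article (the variable names are illustrative and not taken from the study):

```python
# Figures quoted in the article; variable names are illustrative only.
total_coding = 22_210          # coding genes found across the three reference sets
in_all_three = 19_446          # genes present in all three annotations
flagged_in_all_three = 1_470   # genes in all three sets that do not evolve like coding genes

# Genes annotated as coding in only one or two of the reference sets
in_one_or_two = total_coding - in_all_three

# Genes whose coding status the study calls into question
doubtful = in_one_or_two + flagged_in_all_three

# Share of all annotated coding genes that are in doubt
share = doubtful / total_coding

print(in_one_or_two)    # 2764
print(doubtful)         # 4234
print(f"{share:.1%}")   # 19.1%
```

The 19.1 percent share corresponds to the "almost one in five" of the paper's title.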

The study is already paying off, according to the scientists. "We have been able to analyze many of these genes in detail," Tress explains, "and more than 300 genes have already been reclassified as non-coding." The results are already being included in the new annotations of the human genome by the GENCODE international consortium, of which the CNIO researchers are part.

Conflicting gene numbers in recent years

The work once again highlights doubts about the number of real genes present in human cells 15 years after the sequencing of the human genome. Although the most recent data indicates that the number of genes encoding human proteins could exceed 20,000, Federico Abascal, of the Wellcome Trust Sanger Institute in the United Kingdom and first author of the work, says, "Our evidence suggests that humans may only have 19,000 coding genes, but we still do not know which 19,000 genes are."

For his part, David Juan, of the Pompeu Fabra University and a participant in the study, reiterates the importance of these results: "Surprisingly, some of these unusual genes have been well studied and have more than 100 scientific publications based on the assumption that the gene produces a protein."

This study suggests that there is still a large amount of uncertainty, since the final number of coding genes could be 2,000 more or 2,000 fewer than it is now. The human proteome still requires much work, especially given its importance to the medical community.

More information: Federico Abascal et al. Loose ends: almost one in five human genes still have unresolved coding status, Nucleic Acids Research (2018). DOI: 10.1093/nar/gky587
