New algorithm can pinpoint mutations favored by natural selection in large sections of the human genome

February 20, 2018, University of California - San Diego
It is hypothesized that natural selection favors lighter skin in northern latitudes to compensate for vitamin D deficiency due to lower UV radiation. iSAFE identified identical mutations in multiple non-African populations in 5 regions associated with skin pigmentation, suggesting an early response to the onset of selection as humans migrated out of Africa. Blue is derived and red is ancestral. Credit: University of California San Diego

A team of scientists has developed an algorithm that can accurately pinpoint, in large regions of the human genome, mutations favored by natural selection. The finding provides deeper insight into how evolution works, and ultimately could lead to better treatments for genetic disorders. For example, adaptation to chronic hypoxia at high altitude can suggest targets for cardiovascular and other ischemic diseases.

The sequenced genome of a single individual yields about half a terabyte of data, roughly as much as fits on 106 DVDs. A population sample of 1,000 individuals contains 1,000 times as much. To examine such a massive amount of data, researchers turned to computational techniques.
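The storage figures above can be checked with a few lines of arithmetic. This is only a sketch: the 4.7 GB single-layer DVD capacity and the use of decimal units are assumptions not stated in the article, and the function names are illustrative.

```python
# Back-of-the-envelope check of the storage figures quoted in the article.
# Assumptions (not from the article): decimal units (1 TB = 1e12 bytes)
# and a standard single-layer DVD holding 4.7 GB.
GENOME_BYTES = 0.5e12   # "about half a terabyte" per sequenced genome
DVD_BYTES = 4.7e9       # assumed single-layer DVD capacity


def dvds_needed(n_bytes, dvd_bytes=DVD_BYTES):
    """Number of DVDs' worth of data in n_bytes (fractional)."""
    return n_bytes / dvd_bytes


def sample_bytes(n_individuals, per_genome=GENOME_BYTES):
    """Total raw data for a population sample, scaling linearly per genome."""
    return n_individuals * per_genome


if __name__ == "__main__":
    print(f"DVDs per genome: {dvds_needed(GENOME_BYTES):.1f}")
    print(f"1,000 genomes: {sample_bytes(1000) / 1e12:.0f} TB")
```

Under these assumptions, a 1,000-person sample comes to about 500 terabytes, which is why the analysis has to be computational.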

"Computer science and data science are playing a significant role to better understand the code of life and uncover the hidden patterns in our genome," said Ali Akbari, the paper's first author and a Ph.D. student in electrical and computer engineering at the University of California San Diego. "We are analyzing massively large sets of human genomic data to ultimately improve our understanding of the genetic basis of diseases."

Researchers detail the algorithm, dubbed iSAFE, in the Feb. 19 issue of Nature Methods.

Many existing genomic analysis approaches can detect which regions of the genome are evolving under selection pressure. Often, these regions are large, covering millions of base pairs, and the approaches do not shed light on the specific mutations that are responding to the selection pressure. iSAFE doesn't need to know the function of the genomic region it is analyzing or any demographic information for the human population it belongs to. Instead, the researchers used population genetic signals imprinted in the genomes of the sampled individuals and machine learning techniques to reliably identify the mutation favored by selection.

New algorithm can pinpoint mutations favored by natural selection in large sections of the human genome
Credit: Art by Beata Mierzwa

Neighboring mutations 'hitchhike' with the mutation that is under positive selection, leading to a loss of genetic diversity near the favored mutation. iSAFE exploits signals in the neighboring sequences, the so-called "shoulder regions," to pinpoint the favored mutation.
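The loss of diversity around a favored mutation can be illustrated with a toy calculation. The sketch below is not iSAFE and does not reproduce its scoring, which the article does not describe. It only computes average per-site heterozygosity, 2p(1-p), in windows along a small hand-built set of haplotypes. The haplotype matrix, the window size, and the function names are all invented for illustration.

```python
# Toy illustration of reduced diversity near a swept site (not iSAFE itself).
# Rows are haplotypes, columns are sites; 1 = derived allele, 0 = ancestral.
# The data are hand-built: the first five haplotypes carry the favored
# mutation and share an identical background in the central three sites.
HAPLOTYPES = [
    [1, 0, 1, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1, 0, 0, 1],
    [0, 0, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0, 1],
]


def site_heterozygosity(column):
    """Expected heterozygosity 2p(1-p), where p is the derived-allele frequency."""
    p = sum(column) / len(column)
    return 2 * p * (1 - p)


def windowed_diversity(haplotypes, window):
    """Mean per-site heterozygosity in consecutive non-overlapping windows."""
    n_sites = len(haplotypes[0])
    per_site = [
        site_heterozygosity([h[j] for h in haplotypes]) for j in range(n_sites)
    ]
    result = []
    for start in range(0, n_sites, window):
        chunk = per_site[start:start + window]
        result.append(sum(chunk) / len(chunk))
    return result


if __name__ == "__main__":
    left, center, right = windowed_diversity(HAPLOTYPES, window=3)
    print(f"left={left:.3f} center={center:.3f} right={right:.3f}")
```

In this toy data the central window, where the carriers share one background, shows lower diversity than the flanking windows, which is the pattern hitchhiking is expected to leave.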

"Finding the favored mutation among tens of thousands of other, hitchhiking, mutations was like a needle in a haystack problem," said Akbari, who works in the research group of computer science professor Vineet Bafna at the Jacobs School of Engineering at UC San Diego.

To test the algorithm, researchers ran iSAFE on regions of the genome that are home to known favored mutations. The algorithm ranked the correct mutation as the top one out of more than 21,000 possibilities in 69 percent of cases, whereas state-of-the-art methods did so in only 10 percent of cases.
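The "ranked first" figure is a top-1 success rate: the share of test regions in which the known favored mutation received the best score among all candidates. A minimal sketch of that metric follows. The scores are made up, the function names are hypothetical, and counting ties in the true mutation's favor is an assumption; nothing here comes from the paper's benchmark.

```python
# Sketch of a top-1 success rate over test regions (illustrative data only).
# Each region is (scores, true_index): one score per candidate mutation and
# the position of the known favored mutation among the candidates.

def rank_of_true(scores, true_index):
    """1-based rank of the true mutation, with higher scores ranking first.

    Ties are resolved in favor of the true mutation (an assumption).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def top1_rate(regions):
    """Fraction of regions in which the true mutation is ranked first."""
    hits = sum(
        1 for scores, true_index in regions
        if rank_of_true(scores, true_index) == 1
    )
    return hits / len(regions)


# Three made-up regions with three candidates each; real regions in the
# article contain more than 21,000 candidate mutations.
TOY_REGIONS = [
    ([0.1, 0.9, 0.3], 1),  # true mutation has the top score
    ([0.5, 0.2, 0.4], 2),  # true mutation is ranked second
    ([0.7, 0.1, 0.2], 0),  # true mutation has the top score
]

if __name__ == "__main__":
    print(f"top-1 rate: {top1_rate(TOY_REGIONS):.2f}")
```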

The algorithm also identified a host of previously unknown mutations, including five that involve genes related to pigmentation. In these cases, iSAFE identified identical mutations in multiple non-African populations. This suggests an early response to the onset of selection as humans migrated out of Africa.

More information: Ali Akbari et al. Identifying the favored mutation in a positive selective sweep, Nature Methods (2018). DOI: 10.1038/NMETH.4606
