July 10, 2019

New method helping to find deletions and duplications in the human genome

by Sam Sholtis, Pennsylvania State University

A new machine-learning method accurately identifies regions of the human genome that have been duplicated or deleted—known as copy number variants—that are often associated with autism and other neurodevelopmental disorders. The new method, developed by researchers at Penn State, integrates data from several algorithms that attempt to identify copy number variants from exome-sequencing data—high-throughput DNA sequencing of only the protein-coding regions of the human genome. A paper describing the method, which could help clinicians provide more accurate diagnoses for genetic diseases, appears in the July issue of the journal Genome Research.

"Exome sequencing is fast becoming the gold standard for identifying genetic variations in clinical settings because it is faster and less expensive that other methods," said Santhosh Girirajan, associate professor of biochemistry and molecular biology at Penn State and the lead author of the paper. "However, current algorithms for identifying copy number variation from exome sequencing data suffer from very high false-positive rates—many of the variants they identify aren't actually real. With our new method, called "CN-Learn," around 90 percent of the copy number variants we report are real."

The human genome generally contains two copies of every gene, one on each member of a chromosome pair. When a cell divides, the genome is replicated so that each daughter cell gets a full complement of genes. Occasionally, however, errors occur during replication. When such an error is present in a sperm or egg cell, the resulting individual can end up with more or fewer than two copies of a gene.

To identify copy number variants from exome-sequencing data, researchers look at the relative amount of DNA sequences produced from each gene. If there is only one copy of a gene present in an individual, they expect to see fewer sequencing reads than if there are two copies, and three copies of a gene would lead to more reads. But it's not quite that simple, because a number of other factors can influence how many sequencing reads are produced from each gene. Researchers have therefore developed several algorithms to try to correctly identify copy number variants from exome-sequencing data. Individually, however, these algorithms are not particularly reliable.
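The read-depth idea described above can be illustrated with a minimal Python sketch. It is not taken from any of the published algorithms: the function names, the use of a median over reference samples as the two-copy baseline, and the assumption that read depths have already been normalized for sequencing effort are all illustrative choices.

```python
from statistics import median


def estimate_copy_number(sample_depth, reference_depths, expected_copies=2):
    """Estimate how many copies of a gene a sample carries.

    Assumes (hypothetically) that all depths are already normalized and
    that the reference samples carry the expected two copies.
    """
    baseline = median(reference_depths)
    if baseline <= 0:
        raise ValueError("reference depth must be positive")
    # Half the usual depth suggests one copy, 1.5x suggests three copies.
    ratio = sample_depth / baseline
    return round(ratio * expected_copies)


def classify(copy_number, expected_copies=2):
    """Label an estimated copy number as a deletion, duplication, or normal."""
    if copy_number < expected_copies:
        return "deletion"
    if copy_number > expected_copies:
        return "duplication"
    return "normal"


# Made-up depths for one gene across four reference samples.
reference = [98, 102, 100, 101]
for depth in (51, 100, 149):
    copies = estimate_copy_number(depth, reference)
    print(depth, copies, classify(copies))
```

Real data are noisier than this toy case, which is why, as the article notes, a simple ratio like this one tends to produce many false positives.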

"Generally, the high number of false positives from copy-number-variant algorithms has been dealt with by using multiple algorithms and only counting the variants identified by all the methods—like a Venn diagram," said Vijay Kumar Pounraja, a graduate student at Penn State and first author of the paper. "This approach has multiple drawbacks and limitations, so we decided to develop a new machine-learning method instead."

CN-Learn integrates data from four different copy-number-variant algorithms and uses a small set of biologically validated deletions and duplications to learn the signatures of these genomic events. This learning process is facilitated by a machine-learning algorithm called Random Forest, which uses hundreds of decision trees to model the relationship between the genetic context of deletions and duplications and the likelihood that they are validated. CN-Learn then uses this model to predict deletions and duplications in other samples that have not been validated.
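The general approach, training a Random Forest on validated calls and then scoring unvalidated ones, can be sketched with scikit-learn. This is not the CN-Learn code: the features (support from four callers, call size, GC content), the synthetic data, and the rule used to label toy calls as "validated" are all invented here for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)


def make_toy_calls(n):
    """Generate synthetic calls; every value here is made up."""
    caller_support = rng.integers(0, 2, size=(n, 4))  # 1 if a caller reported it
    size_kb = rng.uniform(1, 500, size=(n, 1))        # hypothetical call size
    gc_content = rng.uniform(0.3, 0.7, size=(n, 1))   # hypothetical GC fraction
    features = np.hstack([caller_support, size_kb, gc_content])
    # Toy stand-in for lab validation: "validated" if at least 3 callers agree.
    labels = (caller_support.sum(axis=1) >= 3).astype(int)
    return features, labels


# A small "validated" training set and a few unvalidated calls to score.
train_X, train_y = make_toy_calls(200)
new_X, _ = make_toy_calls(5)

# 200 trees is an arbitrary choice in the spirit of "hundreds of decision trees".
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(train_X, train_y)

# Probability that each new call would be validated, according to the model.
probabilities = model.predict_proba(new_X)[:, 1]
print(probabilities)
```

Unlike a strict intersection, a model like this returns a probability for every call, so a call supported by only some of the algorithms can still be kept if its other characteristics resemble those of validated events.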

"Decisions about a patient's diagnosis and eventual treatment are made based on this information, so it's incredibly important to get them right," said Girirajan. "Because of this, we've made CN-Learn and all of the necessary supporting programs available to download in one easy package."

Journal information: Genome Research

Provided by Pennsylvania State University

Citation: New method helping to find deletions and duplications in the human genome (2019, July 10) retrieved 29 June 2024 from https://medicalxpress.com/news/2019-07-method-deletions-duplications-human-genome.html

