Largest human exome data set reveals an excess of low-frequency non-synonymous coding variants

October 5, 2010

In a paper appearing in Nature Genetics today, an international research group reports the resequencing and analysis of 200 human exomes, establishing the largest human exome data set published so far and revealing an excess of low-frequency, deleterious non-synonymous genetic mutations. The collaborative team includes investigators from BGI-Shenzhen, UC Berkeley, the University of Copenhagen and several other European institutions.

The team used the NimbleGen 2.1M exon capture array to capture 18,654 human coding genes and sequenced the exomes of 200 individuals from Denmark. The average sequencing depth per exome was 12X, and about 95% of the targeted regions were covered by at least one read. In total, 121,870 SNPs were identified in the population, about 44% of which were novel. The team identified 53,081 coding SNPs (cSNPs), 25,275 synonymous and 27,806 non-synonymous, of which 42.6% were novel.
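The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures from this article; the variable names are illustrative, and it assumes (the article does not say so explicitly) that the cSNPs are a subset of the 121,870 SNPs.

```python
# Figures quoted in the article; variable names are illustrative.
total_snps = 121_870
synonymous = 25_275
non_synonymous = 27_806
coding_snps = 53_081

# The two coding classes should sum to the reported cSNP total.
assert synonymous + non_synonymous == coding_snps

# Share of all identified SNPs that are coding SNPs
# (assumes the cSNPs are counted within the 121,870 total).
coding_share = coding_snps / total_snps

# Ratio of non-synonymous to synonymous cSNPs, all frequencies pooled.
ns_to_s_ratio = non_synonymous / synonymous

print(f"coding share of all SNPs: {coding_share:.1%}")
print(f"non-synonymous : synonymous = {ns_to_s_ratio:.2f}")
```

The pooled ratio computed here covers all allele frequencies; it is not the frequency-specific excess discussed in the next paragraph.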

Using this large population data set, the team applied statistical analysis to call SNPs and to estimate the distribution of allele frequencies. To exclude false-positive SNPs, the allele frequency spectrum was estimated only for cSNPs with a minor allele frequency above 2%. By comparing the allele frequency distributions of non-synonymous and synonymous cSNPs, the team identified a 1.8-fold excess of deleterious, non-synonymous cSNPs over synonymous cSNPs in the low allele frequency range of 2-5%. Moreover, this excess was higher for X-linked SNPs, suggesting that deleterious mutations on the X chromosome are primarily recessive. The team further analyzed the potential effect of methylation on allele frequencies by comparing the frequency distribution for sites potentially affected by CpG methylation with that for unaffected sites; no strong effect was detected at a genome-wide scale.
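One naive way to picture a fold excess in a frequency range is sketched below. It is an illustration only: the article does not describe the estimator used in the paper, the function names are hypothetical, and the frequencies are made-up toy values, not data from the study. The sketch keeps SNPs with a minor allele frequency (MAF) above 2%, computes the fraction of each class that falls in the 2-5% range, and takes the ratio of the two fractions.

```python
def fraction_in_bin(mafs, low=0.02, high=0.05):
    """Fraction of SNPs with MAF above `low` that fall in the range (low, high]."""
    kept = [m for m in mafs if m > low]  # drop SNPs at or below the threshold
    if not kept:
        raise ValueError("no SNPs above the MAF threshold")
    return sum(1 for m in kept if m <= high) / len(kept)


def fold_excess(non_syn_mafs, syn_mafs, low=0.02, high=0.05):
    """Ratio of the low-frequency fractions: non-synonymous over synonymous."""
    return fraction_in_bin(non_syn_mafs, low, high) / fraction_in_bin(syn_mafs, low, high)


# Made-up toy minor allele frequencies (NOT data from the study).
non_syn = [0.01, 0.03, 0.04, 0.10, 0.30]  # 2 of the 4 kept SNPs lie in 2-5%
syn = [0.01, 0.03, 0.08, 0.15, 0.40]      # 1 of the 4 kept SNPs lies in 2-5%

print(fold_excess(non_syn, syn))  # prints 2.0
```

A real analysis would also have to account for sequencing error and uneven coverage, which this toy ratio ignores.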

"The study provides a valuable data set for studying the allele frequency spectrum and population genetic patterns," said Dr Yingrui Li, the project investigator from BGI-Shenzhen. "We found more low-frequency deleterious mutations in coding regions than previously expected, and most of them are recessive. We therefore support the idea that much of the heritable variation affecting fitness is caused by low-frequency mutations."

Association studies have detected only a limited share of the heritable variation associated with common polygenic traits, and genotyping analysis generally overlooks the effects of low-frequency mutations. The results of this study further demonstrate that exome sequencing is an effective and promising approach for identifying genetic variants associated with human traits and for studying population genetics. The team expects that future analyses of non-coding regions and ethnically diverse samples will help build a complete picture of human genomic variation and an understanding of the interaction between genetic drift, mutation, recombination, and selection in the human genome.

Previously, a paper in Science (Science. 2010 July; 329(5987): 75-78) reported sequencing the exomes of 50 Tibetan individuals and found evidence for high-altitude adaptation in Tibetan populations. Together, these studies show that next-generation sequencing is finding more applications and has great potential in genomics research, drug discovery and personalized medical treatment.
