Overview of the SNV-sets and SKAT tests performed, and the overlap between the results. (a) The Venn diagram shows the number of loci with any significant SKAT association that overlap between the different SNV-sets. For cis-associations, a p-value of 5.88 × 10⁻⁶ was used as the significance threshold, while for trans-associations the threshold was 4.67 × 10⁻¹⁰. (b) Fraction of loci identified by the different models within each SNV-set. A total of 198, 190, 182, 33, and 27 loci were identified with the five SNV-sets, respectively (N in the legend). Each bar represents the fraction of these N loci that were significant for the different SKAT models. The seven models are: (1) Unweighted; (2) CADD or Eigen weighted; (3) MAF weighted, β(1, 25); (4) MAF weighted, β(1, 5); (5) MAF weighted, β(0.5, 0.5); (6) CommonRare; (7) Rare only. Credit: Nature Communications (2022). DOI: 10.1038/s41467-022-30208-8

Although some rare genetic variants can increase the risk of disease markedly for a few individuals, the genetic contribution to common diseases is mostly due to a combination of many common genetic variants with small effects. This is shown in a comprehensive study by researchers at Uppsala University and SciLifeLab, published in the journal Nature Communications.

It is known that genetic factors, together with lifestyle and environment, contribute to each individual's vulnerability to common non-communicable diseases, such as cardiovascular diseases and cancers. During the last 15 years, researchers in genetic epidemiology have successfully identified genes that contribute to heritability, i.e. the degree to which a given trait is passed from parents to offspring through our genes. However, a significant fraction of the heritability has not yet been explained by the genetic variants identified. Until recently, high-throughput technologies for genetic analyses were limited to a selection of genetic variants that are informative in any given population.

However, in recent years, novel DNA sequencing technologies that enable researchers to study each individual position in the human genome have become available. It has been shown that a vast majority of the genetic variants are very rare, and sometimes even specific to a population. It is therefore plausible that previous genetic studies have overlooked a majority of the disease-causing genetic effects.

In the current study, the scientists used high-throughput next-generation sequencing to characterize the genetic variation in a Swedish cohort of over 1,000 participants and linked that variation to functional consequences mediated by proteins, the gene products.

"Proteins, the products of our genes, mediate the effects of our genes on disease risk. Therefore, characterizing the link between variation at genetic and [protein] level is of great importance to understanding how genetic variation causes diseases," says lead scientist Åsa Johansson, Docent at the Department of Immunology, Genetics and Pathology at Uppsala University and SciLifeLab.

Over four hundred proteins were targeted in the current study, and the scientists showed that many proteins were influenced by genetic variants. It was evident that rare mutations often have larger phenotypic effects on the proteome than common variants do. However, precisely because they are so uncommon, rare variants do not appear to explain very much of the heritability. The results were also supported by theoretical computations, and they challenge some of the hypotheses that have existed in the field for some time.
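
Why a large effect in a few carriers can still add little to heritability can be illustrated with a standard textbook approximation; this is not the study's own method. For a single variant with an additive effect, assuming Hardy-Weinberg equilibrium and a trait standardized to unit variance, the fraction of trait variance explained is roughly 2 × MAF × (1 − MAF) × β², where MAF is the minor allele frequency and β is the per-allele effect in standard deviations. The function name and the allele frequencies and effect sizes in this minimal sketch are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def variance_explained(maf: float, beta: float) -> float:
    """Approximate fraction of trait variance explained by one variant.

    Assumptions (illustrative, not from the study): additive effect,
    Hardy-Weinberg equilibrium, and a trait scaled to unit variance,
    so `beta` is the per-allele effect in standard deviations.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


# Hypothetical common variant: 30% allele frequency, small effect (0.1 SD).
common = variance_explained(maf=0.30, beta=0.1)

# Hypothetical rare variant: 0.1% allele frequency, tenfold larger effect (1 SD).
rare = variance_explained(maf=0.001, beta=1.0)

print(f"common variant: {common:.4%} of trait variance")
print(f"rare variant:   {rare:.4%} of trait variance")
```

With these made-up numbers, the rare variant shifts a carrier's protein level ten times more than the common variant does, yet it accounts for less of the variance across the whole population, because so few people carry it.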

"Surprisingly, even using statistical models developed to capture the effects of rare variants, very few associations were identified, in contrast to the associations for common variants," says Marcin Kierczak, Docent at the Department of Cell and Molecular Biology at Uppsala University and bioinformatics expert at the National Bioinformatics Infrastructure, Sweden, who implemented the bioinformatics pipeline used in the analyses.

This suggests that the heritability of the proteome, as well as of downstream disease, is due to a much larger degree to common variation than to rare variants. However, heritability is a measure of the total burden of genetic contributions to a disease in the population, and the effect of rare variants at the individual level could still be high.

"Even if our results showed that in the general population the major burden of [disease] is due to common variants, there are still individuals with rare variants that dramatically influence their risk of disease," says Johansson.

"It is therefore important to highlight the usefulness of high-throughput sequencing technologies to increase our ability to identify those individuals who have a pronounced genetic risk of diseases and could be suitable for precision medicine interventions," says Valeria Lo Faro, Research Assistant at the Department of Immunology, Genetics and Pathology at Uppsala University and SciLifeLab.

More information: Marcin Kierczak et al, Contribution of rare whole-genome sequencing variants to plasma protein levels and the missing heritability, Nature Communications (2022). DOI: 10.1038/s41467-022-30208-8

Provided by Uppsala University