ZIP code or genetic code?
When it comes to disease and health, which is more powerful—zip code or genetic code?
The degree to which nature and nurture affect disease and health remains one of the eternal—and still unanswerable—questions in medicine.
Now a team of investigators from Harvard Medical School and the University of Queensland in Australia has tackled this question in a decidedly novel way.
In what the researchers describe as a coup for big data and a scientific first, the team has used a massive insurance database of nearly 45 million people in the United States including thousands of twin pairs to determine the effects of genes and environment in 560 common conditions. The diseases analyzed span 23 categories, ranging from cardiovascular illness and neuromuscular diseases to skeletal conditions.
The work, published Jan. 14 in Nature Genetics, is thought to provide the largest assessment of U.S. twins to date, the researchers said. It is also the first one to go beyond the traditional one-disease-at-a-time approach and analyze hundreds of the most common conditions among more than 56,000 twin pairs. To date, most twin or familial studies of genes and environment have looked at a single disease or one environmental factor at a time.
Classic inherited conditions are caused strictly by mutations in a gene or a set of genes, while environmentally fueled conditions are the sole result of factors external to an individual's biology. Most diseases, however, do not fall neatly into either category but arise from a complex interplay between the two. Disentangling how genes and environment contribute to multiple diseases in the same population has been astoundingly difficult, the researchers said. The new study aims to solve this challenge by developing a new large-scale analytical approach.
"The nurture-versus-nature question is very much at the heart of our study. We foresee the value of this type of large-scale analysis will be in shining light on the relative contribution of genes versus shared environment in a multitude of diseases," said senior study author Chirag Patel, assistant professor of biomedical informatics in the Blavatnik Institute at Harvard Medical School.
The new method, the team said, underscores the value of large-scale analyses in informing national research efforts such as the National Institutes of Health's All of Us program, part of the Precision Medicine Initiative, which aims to tease out biologic, genetic, social and environmental factors in disease and health as a way to inform individualized therapies. The findings of the new study can help direct research efforts by clarifying the relative influence of genetic versus environmental factors for a range of diseases.
"Our findings can provide signposts that inform subsequent research efforts and helps scientists narrowly focus their pursuits," said study first author Chirag Lakhani, a post-doctoral research fellow in biomedical informatics in the Blavatnik Institute at Harvard Medical School. "For example, if our study of twins shows that there is very little heritability effect in a certain family of eye disorders, then future research should pursue alternative explanations."
Using a database of nearly 45 million patient records, including more than 56,000 twin pairs and more than 724,000 sibling pairs, the investigators estimated the influence of genes and environment by comparing fraternal twins, who share on average half of their DNA, with identical twins, whose DNA is 100 percent the same. Opposite-sex twins are always fraternal, but same-sex twins can be either identical or fraternal, and the researchers did not know which same-sex pairs were identical. To circumvent this hurdle, they developed a novel statistical method that infers the probability that a given pair of twins is identical or fraternal (non-identical). In doing so, the researchers were able to separate purely genetic from non-genetic contributions.
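The idea of working with twins of unknown zygosity can be illustrated with two textbook tools: Weinberg's differential rule and Falconer's formulas. The sketch below is an assumption-laden illustration in Python, not the statistical method the researchers developed. The function names, the pair counts and the correlations are all hypothetical. It also assumes that fraternal pairs are equally likely to be same-sex or opposite-sex, that opposite-sex pairs stand in for all fraternal pairs, and that the same-sex correlation is a simple weighted average of the identical and fraternal correlations.

```python
def prob_identical_given_same_sex(n_same_sex: int, n_opposite_sex: int) -> float:
    """Weinberg's differential rule (illustrative, hypothetical helper).

    Opposite-sex pairs are always fraternal, and fraternal pairs are assumed
    to be same-sex about as often as opposite-sex. The number of same-sex
    fraternal pairs is therefore approximated by the opposite-sex count.
    """
    if n_same_sex <= 0:
        raise ValueError("n_same_sex must be positive")
    n_identical = max(n_same_sex - n_opposite_sex, 0)
    return n_identical / n_same_sex


def falconer_from_mixture(r_same_sex: float, r_opposite_sex: float,
                          p_identical: float) -> dict:
    """Falconer-style variance shares when same-sex zygosity is unknown.

    Assumes r_same_sex = p * r_mz + (1 - p) * r_dz and r_dz = r_opposite_sex.
    """
    if not 0 < p_identical <= 1:
        raise ValueError("p_identical must be in (0, 1]")
    r_dz = r_opposite_sex
    # Back out the identical-twin correlation from the same-sex mixture.
    r_mz = (r_same_sex - (1 - p_identical) * r_dz) / p_identical
    return {
        "h2": 2 * (r_mz - r_dz),  # share attributed to genes
        "c2": 2 * r_dz - r_mz,    # share attributed to shared environment
        "e2": 1 - r_mz,           # remainder (non-shared environment, noise)
    }


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
p = prob_identical_given_same_sex(n_same_sex=30_000, n_opposite_sex=15_000)
estimates = falconer_from_mixture(r_same_sex=0.55, r_opposite_sex=0.40,
                                  p_identical=p)
print(p, estimates)
```

With these made-up inputs, half of the same-sex pairs are inferred to be identical, and the variance shares come out at roughly 0.6 for genes, 0.1 for shared environment and 0.3 for the remainder. A per-pair probability model, like the one the article describes, would refine this population-level estimate.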
All patients had been part of the insurance database for at least 3 years, providing the researchers with more than just a snapshot in time. However, the newly published study, which involved young twin pairs ranging from newborns to 24 years of age, was not designed to follow disease development over time. This meant the researchers were unable to assess genetic and environmental influences on certain diseases that tend to develop in middle and older age, such as cardiovascular disease and neurodegenerative conditions.
The analysis included variables such as clinical diagnosis, imaging test results, blood chemistry tests such as red and white blood cell counts, cholesterol levels and many others, as well as environmental factors such as air pollution levels, climate conditions and socioeconomic status, all extrapolated from the patients' zip codes.
About 40 percent of the diseases in the study (225 of 560) had a genetic component, while 25 percent (138 of 560) were driven at least in part by factors stemming from a shared living environment, such as sharing the same household, social influences and the like. Cognitive disorders demonstrated the greatest degree of heritability, with four out of five diseases showing a genetic component, while connective tissue diseases had the lowest degree of genetic influence. Of all disease categories, eye disorders carried the highest degree of environmental influence, with 27 out of 42 diseases showing such an effect. They were followed by respiratory diseases, with 34 out of 48 conditions showing an effect stemming from sharing the same household. The disease categories with the lowest environmental influence were reproductive illnesses, with three of 18 conditions showing such an effect, and cognitive conditions, with two out of five showing such influence.
Overall, socioeconomic status, climate conditions and air quality of each twin pair's zip code had a far weaker effect on disease than genes and shared environment—a composite measure of external, nongenetic influences including family and lifestyle, household and neighborhood.
In total, 145 of 560 diseases were modestly influenced by socioeconomic status derived from zip code. Thirty-six diseases were influenced, at least in part, by air quality, and 117 were affected by changes in temperature. The condition most potently linked to socioeconomic status was morbid obesity. While obesity undoubtedly has a genetic component, the researchers said, the findings raise an important question about the influence of environment on genetic predispositions.
"This finding opens up a whole slew of questions, including whether and how a change in socioeconomic status and lifestyle might compare against genetic predisposition to obesity," Patel said.
Lead poisoning was, not surprisingly, entirely driven by shared environment. Conditions such as flu and Lyme disease were, again unsurprisingly, affected by differences in climate.
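The counts reported above can be put on a common scale by expressing each as a share of the 560 conditions studied. The short Python sketch below does only that arithmetic. The dictionary labels and the helper name are illustrative choices, while the counts themselves are the ones quoted in the article.

```python
TOTAL_CONDITIONS = 560  # number of conditions analyzed in the study

# Counts quoted in the article; the labels are shorthand chosen for this sketch.
conditions_influenced_by = {
    "genetic component": 225,
    "shared environment": 138,
    "socioeconomic status": 145,
    "temperature": 117,
    "air quality": 36,
}


def share_percent(count: int, total: int = TOTAL_CONDITIONS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {factor: share_percent(n) for factor, n in conditions_influenced_by.items()}

for factor, pct in shares.items():
    print(f"{factor}: {pct}% of conditions")
```

Because a single condition can be influenced by more than one factor, these shares are not expected to add up to 100 percent.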
When researchers looked at classes of diseases by monthly health care spending, they found that both genes and environment significantly contributed to cost of care with the two being nearly equal drivers of spending. Nearly 60 percent of monthly health spending could be predicted by analyzing genetic and environmental factors.
Large-scale analyses like this one can help forecast long-term spending for various conditions and inform resource allocation and policy decisions, the researchers said.
Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes, Nature Genetics (2019). DOI: 10.1038/s41588-018-0313-7, https://www.nature.com/articles/s41588-018-0313-7