Study unlocks trove of public health data to help fight deadly contagious diseases

In an unprecedented windfall for public access to health data, University of Pittsburgh Graduate School of Public Health researchers have collected and digitized all weekly surveillance reports for reportable diseases in the United States going back more than 125 years.

The easily searchable database, described in the Nov. 28 issue of the New England Journal of Medicine, is free and publicly available. Supported by the Bill & Melinda Gates Foundation and the National Institutes of Health (NIH), the project's goal is to aid scientists and officials in the eradication of deadly and devastating diseases.

"Using this database, we estimate that more than 100 million cases of serious childhood contagious diseases have been prevented, thanks to the introduction of vaccines," said lead author Willem G. van Panhuis, M.D., Ph.D., assistant professor of epidemiology at Pitt Public Health. "But we also are able to see a resurgence of some of these diseases in the past several decades as people forget how devastating they can be and start refusing vaccines."

Despite the availability of a pertussis vaccine since the 1920s, the largest pertussis epidemic in the U.S. since 1959 occurred last year. Measles, mumps and rubella outbreaks also have reoccurred since the early 1980s.

"Analyzing historical epidemiological data can reveal patterns that help us understand how spread and what interventions have been most effective," said Irene Eckstrand, Ph.D., of NIH, which partially funded the research through its Models of Infectious Disease Agent Study. "This new work shows the value of using computational methods to study historical data—in this case, to show the impact of vaccination in reducing the burden of infectious diseases over the past century."

"We are very excited about the release of the database," said Steven Buchsbaum, deputy director, Discovery and Translational Sciences, for the Bill & Melinda Gates Foundation. "We anticipate this will not only prove to be an invaluable tool permitting researchers around the globe to develop, test and validate epidemiological models, but also has the potential to serve as a model for how other organizations could make similar sets of critical public more broadly, publicly available."

The digitized dataset is dubbed Project Tycho™, for 16th-century Danish nobleman Tycho Brahe, whose meticulous astronomical observations enabled Johannes Kepler to derive the laws of planetary motion.

"Tycho Brahe's data were essential to Kepler's discovery of the laws of planetary motion," said senior author Donald S. Burke, M.D., Pitt Public Health dean and UPMC-Jonas Salk Chair of Global Health. "Similarly, we hope that our Project Tycho disease database will help spur new, life-saving research on patterns of epidemic infectious disease and the effects of vaccines. Open access to disease surveillance records should be standard practice, and we are working to establish this as the norm worldwide."

The researchers selected eight vaccine-preventable contagious diseases for a more detailed analysis: smallpox, polio, measles, rubella, mumps, hepatitis A, diphtheria and pertussis. By overlaying the reported outbreaks with the year of vaccine licensure, the researchers are able to give a clear, visual representation of the effect that vaccines have in controlling communicable diseases.
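As a rough illustration of comparing reported cases before and after a licensure year, the sketch below splits yearly case counts at that year and computes the percent drop in the median. It is only a minimal sketch, not the researchers' method: the function names, the licensure year and every number in it are hypothetical placeholders, not Project Tycho data.

```python
from statistics import median


def split_by_licensure(records, licensure_year):
    """Split (year, cases) records into pre- and post-licensure case lists."""
    before = [cases for year, cases in records if year < licensure_year]
    after = [cases for year, cases in records if year >= licensure_year]
    return before, after


def percent_reduction(before, after):
    """Percent drop in median reported cases from the pre- to the post-licensure period."""
    baseline = median(before)
    if baseline == 0:
        raise ValueError("baseline median is zero; reduction is undefined")
    return 100.0 * (baseline - median(after)) / baseline


# Placeholder values for illustration only, NOT real surveillance data.
records = [(1960, 400), (1961, 500), (1962, 600),
           (1963, 100), (1964, 50), (1965, 20)]

# Hypothetical licensure year chosen to split the placeholder series in half.
before, after = split_by_licensure(records, licensure_year=1963)
reduction = percent_reduction(before, after)
print(f"Median reported cases fell by {reduction:.1f}% after licensure")
```

The median is used here rather than the mean so that a single large epidemic year does not dominate the baseline; that is a design choice of this sketch, not something stated in the study.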

"Infectious disease research is critically dependent on reliable historical data to understand underlying epidemic dynamics. However, my colleagues and I repeatedly find ourselves digging out historical datasets from various sources in different states of preservation," said Dr. van Panhuis. "By digitizing and giving open access to the entire collection of U.S. notifiable disease data, we've made a bold move toward solving this problem."

The researchers obtained all weekly notifiable disease surveillance tables published between 1888 and 2013—approximately 6,500 tables—in various historical reports, including the U.S. Centers for Disease Control and Prevention's Morbidity and Mortality Weekly Report. These tables were available only in paper format or as PDF scans in online repositories that could not be read by computers and had to be hand-entered. With an estimated 200 million keystrokes, the data—including death counts, reporting locations, time periods and diseases—were digitized. A total of 56 diseases were reported for at least some period of time during the 125-year time span, with no single disease reported continuously.

"This work by the Tycho Team is remarkable and represents the next step in making government data accessible and useful," said Bryan Sivak, U.S. Department of Health and Human Services chief technology officer and entrepreneur in residence. "This is a great example of how our policies on open data and accelerate the use of computer-readable data by researchers and application developers to create new tools and provide valuable insights into the nation's health."

All these data now can be explored and retrieved by everyone on the Project Tycho Web site http://www.tycho.pitt.edu. The open access release of these data has ignited a collaboration with the United States Open Government Initiative and, in the near future, the Project Tycho database will be available on the HealthData.gov Web pages.

"Historical records are a precious yet undervalued resource. As Danish philosopher Soren Kierkegaard said, we live forward but understand backward," explained Dr. Burke. "By 'rescuing' these historical disease data and combining them into a single, open-access, computable system, we now can better understand the devastating impact of epidemic diseases, and the remarkable value of vaccines in preventing illness and death."

User comments

sportnut
1 / 5 (1) Nov 27, 2013
There's two big factors for a logical reason to avoid inoculations.... one is that you're fooling around with the immune system, something very sensitive... it has to be able to distinguish between my healthy cells, and my cancerous cells... do you really want to mess with that... and two... a lot of people are getting autoimmune diseases which are worse than most of these communicable diseases.... why hype up and confuse the immune system with inoculations... most people have a great immune system... maybe only those with poor immune systems, people who get sick a lot, should get inoculations... I'm a nurse, and a medical illustrator... I've seen a lot, I've studied a lot...
maxxxxpower
not rated yet Nov 28, 2013
There is no other way to put this, sportnut, other than you are completely wrong. I will take the word of scientists, doctors and decades of research vs a nurse with what seems like a poor understanding of how the immune system works.