Tracking the flu with data
The Centers for Disease Control and Prevention recently declared a flu epidemic in the U.S., with the virus appearing in 46 states so far. Many people have stayed home sick, and officials have announced that this year's vaccine is not as effective as in years past. Alessandro Vespignani is a world-renowned statistical physicist and the Sternberg Distinguished Professor of Physics at Northeastern, where he holds joint appointments in the College of Science, the College of Computer and Information Science, and the Bouvé College of Health Sciences. He and his team in the university's Laboratory for the Modeling of Biological and Socio-Technical Systems are using large amounts of data to model the spread of the virus and predict when the outbreak will begin to taper off.
Here, Vespignani discusses the science behind his predictions and what they say about the future of this year's flu season in Massachusetts and beyond.
The CDC has declared a national flu epidemic. What's your assessment of how widespread the flu has become in the U.S. this season, and the likelihood that it will continue to grow at the current rate or faster? And what might be the impact in Massachusetts, specifically?
The CDC data show widespread activity in most of the U.S. The intensity of the epidemic is also remarkable, following the path of the nasty 2012-13 season. However, the most recent data and forecast models are telling us that we are going through the peak right now, and that activity will likely start decreasing in most of the U.S. This does not mean that we are "out of the woods" yet. Being at the peak of the season means we are just halfway through it, so we have to expect several more weeks of sustained flu activity. The usual recommendations about getting the flu shot and not going to work if you feel ill still apply in full.
In Massachusetts, the flu season appears to be following the national trend with a slight delay. However, our region had a very "bumpy" 2013-14 season, with multiple peaks and irregular activity. Hopefully this year does not have too many surprises in store for us.
You and your colleagues around the globe recently created a tool that allows people to visually explore the flu data in several countries and from a variety of sources, including the CDC. How does this tool work and how can people use it?
We have set up a computational platform, Fluoutlook.org, that allows people to follow the flu season by looking at the real-time data released by the various national flu surveillance systems and by exploring several different forecasting algorithms that project the evolution of the epidemic up to four weeks in advance. The algorithms we use for the forecasts span a wide range of techniques, including dynamic generative models that take into account the geographical regions within each country and infer the epidemic parameters specific to the season, such as the transmissibility of the virus. We currently cover more than half a dozen countries, including the U.S. and Canada, and we aim to expand the platform by progressively adding new countries, models, and data. We are also opening the platform to other modeling groups and hope to aggregate more forecasting systems in it in the near future. The aim is to provide a real-time tool with which users can explore data, gain situational awareness, investigate trends, and look at forecasts that are generally available only to a small number of practitioners in the field. Because we're operating in real time, we update the platform weekly and issue new forecasts as soon as each new dataset is released by the surveillance systems. Reliable flu forecasting is still an open scientific problem, and we hope that this platform will help in testing, comparing, and evaluating different techniques in different countries.
In addition to the flu, your lab has produced groundbreaking research on predicting the spread of the Ebola virus and other diseases. How do you go about creating these forecasts, and does predicting the flu present any significant challenges in particular?
We go after epidemics by developing large-scale computational epidemic models that integrate socio-demographic and mobility data for the population under study. These models are detailed down to the individual level and reproduce the dynamics of the epidemic by simulating infection transmission events on a computer for millions of individuals in their social and geographical settings. In a nutshell, what we do is akin to what is done with computerized weather forecasts. The difference is that the data, models, and algorithms we use describe the individuals and the biological processes underlying the spread of the disease instead of the physical processes of meteorological systems.
The flu, although it is a seasonal disease that we know very well, is very elusive from a modeling perspective. It does not have a definite geographical starting point. The dominant flu strain changes every year, and typically there are several co-circulating strains. These are some of the reasons why we do not yet have reliable forecasting systems in place. Tools like Fluoutlook.org are a first attempt at, and not the final solution to, the problem of real-time epidemic forecasting. Indeed, Fluoutlook.org is an effort that we will continue to support so that the models, their eventual improvements, and their reliability can be evaluated over the span of several years and in a wide range of geographical and social contexts. There is a lot of work still out there waiting to be done.
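To illustrate the individual-level simulation approach Vespignani describes, here is a minimal sketch in Python. It is not the lab's model. It assumes a simple susceptible-infected-recovered (SIR) scheme in which each infected person meets a few randomly chosen people per day, and it ignores the socio-demographic and mobility data that the real models integrate. The function name, parameters, and default values are all hypothetical and chosen only for illustration.

```python
import random


def simulate_epidemic(n_people=1000, n_contacts=8, p_transmit=0.05,
                      p_recover=0.25, n_seeds=5, n_days=120, seed=0):
    """Toy individual-level stochastic SIR simulation.

    All parameter values are illustrative assumptions, not fitted estimates.
    Returns one (susceptible, infected, recovered) tuple per simulated day.
    """
    rng = random.Random(seed)

    # Track the state of every individual: "S", "I", or "R".
    state = ["S"] * n_people
    for person in rng.sample(range(n_people), n_seeds):
        state[person] = "I"

    history = []
    for _ in range(n_days):
        # Record the counts at the start of the day.
        history.append((state.count("S"), state.count("I"), state.count("R")))

        infected = [i for i, s in enumerate(state) if s == "I"]
        newly_infected = set()
        newly_recovered = []
        for person in infected:
            # Assumption: contacts are drawn uniformly at random each day.
            for contact in rng.sample(range(n_people), n_contacts):
                if state[contact] == "S" and rng.random() < p_transmit:
                    newly_infected.add(contact)
            # Each infected person recovers with a fixed daily probability.
            if rng.random() < p_recover:
                newly_recovered.append(person)

        # Apply all state changes together at the end of the day.
        for person in newly_infected:
            state[person] = "I"
        for person in newly_recovered:
            state[person] = "R"
    return history


history = simulate_epidemic()
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("peak day:", peak_day, "infected at peak:", history[peak_day][1])
```

Running the simulation many times with different seeds gives a spread of possible epidemic curves rather than a single answer, which is one reason forecasts of this kind are usually expressed as ranges.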