Using Twitter to track the flu: Researchers find a better way to screen the tweets

January 25, 2013

Sifting through social media messages has become a popular way to track when and where flu cases occur, but a key hurdle hampers the process: how to identify flu-infection tweets. Some tweets are posted by people who have been sick with the virus, while others come from folks who are merely talking about the illness. If you are tracking actual flu cases, such conversations about the flu in general can skew the results.

To address this problem, Johns Hopkins computer scientists and researchers in the School of Medicine have developed a new tweet-screening method that not only delivers real-time data on flu cases, but also filters out online chatter that is not linked to actual flu infections. Comparing their method, which is based on analysis of 5,000 publicly available tweets per minute, to other Twitter-based tracking tools, the Johns Hopkins researchers say their real-time results track more closely with government disease data that takes much longer to compile.

"When you look at Twitter posts, you can see people talking about being afraid of catching the flu or asking friends if they should get a or mentioning a public figure who seems to be ill," said Mark Dredze, an assistant research professor in the Department of Computer Science who uses tweets to monitor public health trends. "But posts like this don't measure how many people have actually contracted the flu. We wanted to separate hype about the flu from messages from people who truly become ill."

A video produced by Twitter about Johns Hopkins’ use of tweets to track public health trends.

Dredze, who is also a research scientist at the Johns Hopkins Human Language Technology Center of Excellence, led a team that in mid-2011 released one of the first and most comprehensive studies showing that Twitter data can yield useful public health information. Since then, this strategy has become so popular that the U.S. Department of Health and Human Services last summer sponsored a contest challenging researchers to design an online application that could track major disease outbreaks.

This winter, as the United States entered an unusually severe and early flu season, Twitter-based flu projections have drawn increasing attention. Many public tweets, such as, "I'm so sick this week with the flu," can indicate a rise in the flu rate. Collecting enough of these tweets can help health officials gauge the scope and severity of an epidemic.

But the reliability of many computer models can be weakened by too many tweets that point to flu-related news reports and other matters not directly linked to a flu case, according to David Broniatowski, a School of Medicine postdoctoral fellow in the Department of Emergency Medicine's Center for Advanced Modeling in the Social, Behavioral, and Health Sciences. "For example," he said, "a recent spike in Twitter flu activity was caused by discussions about basketball legend Kobe Bryant's flu-like symptoms during a recent game. Mr. Bryant's health notwithstanding, such tweets do very little to help public health officials prepare our nation for the next big outbreak."

To improve their accuracy when using tweets to track the flu, the Johns Hopkins team developed sophisticated statistical methods based on human language processing technologies. The methods are designed to filter out the chatter. The system can distinguish, for example, between "I have the flu" and "I'm worried about getting the flu."
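
The distinction between the two example tweets can be illustrated with a deliberately simple sketch. This is not the Johns Hopkins method, which relies on statistical language processing; it is a toy rule-based filter, and the pattern lists and the function name `classify_tweet` are assumptions made up for illustration.

```python
import re

# Hypothetical cue phrases, invented for this illustration only.
# A real system would learn such signals statistically from labeled tweets.
INFECTION_PATTERNS = [
    re.compile(r"\bi have the flu\b"),
    re.compile(r"\b(?:i'm|i am) (?:so )?sick\b.*\bflu\b"),
    re.compile(r"\b(?:caught|got) the flu\b"),
]
AWARENESS_PATTERNS = [
    re.compile(r"\bworried about\b"),
    re.compile(r"\bafraid of\b"),
    re.compile(r"\bflu shot\b"),
]


def classify_tweet(text: str) -> str:
    """Label a tweet as 'infection', 'awareness', or 'unrelated'."""
    lowered = text.lower()
    if "flu" not in lowered:
        return "unrelated"
    # Worry, fear, and vaccine talk count as chatter even if the
    # tweet also contains an infection-like phrase.
    if any(p.search(lowered) for p in AWARENESS_PATTERNS):
        return "awareness"
    if any(p.search(lowered) for p in INFECTION_PATTERNS):
        return "infection"
    # Flu is mentioned but nothing suggests the author is ill.
    return "awareness"


if __name__ == "__main__":
    for tweet in ("I have the flu", "I'm worried about getting the flu"):
        print(tweet, "->", classify_tweet(tweet))
```

Hand-written rules like these break down quickly on real tweets, which is one reason a statistical approach is a better fit for the problem the researchers describe.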

Another advantage of the Johns Hopkins flu projection method is that it can produce real-time results. By comparison, the U.S. Centers for Disease Control and Prevention, which record flu-related symptoms from hospital visits, typically take two weeks to publish data on the flu's prevalence.

To check the reliability of their enhanced system, the Johns Hopkins researchers recently compared their results to CDC data for the same period. The researchers said that during November and December 2012, their system demonstrated a substantial improvement in tracking with CDC figures as compared to previous Twitter-based tracking methods. "In late December," Dredze added, "the news media picked up on the flu epidemic, causing a somewhat spurious rise in the rate produced by our Twitter system. But our new algorithm handles this effect much better than other systems, ignoring the spurious spike in tweets."
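
One common way to quantify how closely two weekly series track each other is the Pearson correlation coefficient. The article does not say which measure the researchers used, so the sketch below is an assumption, and the function name `pearson` and the sample numbers are placeholders invented for illustration, not real Twitter or CDC figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder weekly values, NOT real data.
    twitter_rate = [1.0, 2.0, 3.0, 4.0]
    official_rate = [2.0, 4.0, 6.0, 8.0]
    print(pearson(twitter_rate, official_rate))
```

A value near 1 would mean the Twitter-based estimate rises and falls with the official figures; a value near 0 would mean the two series are largely unrelated.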

The researchers have also used their Twitter data to produce United States maps that document the stark differences between last year's mild flu season and the much higher incidence of the virus in the winter of 2012-2013.

While their new method was only recently developed, the Johns Hopkins researchers chose to release information on the flu tracking system because of the higher incidence of illness this winter. Team members hope to share the enhanced flu tracking method with leading government health agencies.

"This new work demonstrates that Twitter posts can be used to guide public health officials in their response to outbreaks of infectious diseases," Dredze said. "Our hope is that the new technology can be used track other diseases as well."

Other Johns Hopkins researchers participating in the Twitter flu project are doctoral student Michael Paul and recent bachelor's degree graduate Alex Lamb, both in the Department of Computer Science.

The Johns Hopkins researchers noted that their enhanced Twitter analysis system looked only at public tweets in which all user names and gender information had been removed. The system was tested only on messages from the United States. The research was funded in part by the National Institutes of Health's Models of Infectious Disease Agent Study.
