Can monitoring Wikipedia hits show how many people have the flu? Researchers at Boston Children's Hospital, USA, have developed a method of estimating levels of influenza-like illness in the American population by analysing Internet traffic on specific flu-related Wikipedia articles.
David McIver and John Brownstein's model, publishing in PLOS Computational Biology on April 17th, estimates flu levels in the American population up to two weeks sooner than data from the Centers for Disease Control and Prevention becomes available, and accurately estimates the week of peak influenza activity 17% more often than Google Flu Trends data.
McIver and Brownstein calculated the number of times certain Wikipedia articles were accessed each day from December 2007 to August 2013. The model they developed performed well both through influenza seasons that were more severe than normal and through events that received high levels of media attention, such as the H1N1 pandemic in 2009.
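To make the idea concrete, here is a minimal sketch of how daily article-view counts could be turned into a weekly illness estimate. It is an illustration only, not the authors' model: the function names, the single-predictor least-squares fit, and all of the numbers are assumptions made up for this example.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily_views):
    """Sum daily page-view counts into ISO-week totals.

    daily_views: dict mapping datetime.date -> int (views on that day).
    Returns a dict mapping (iso_year, iso_week) -> total views.
    """
    totals = defaultdict(int)
    for day, views in daily_views.items():
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += views
    return dict(totals)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Toy data (invented for illustration): three full weeks of daily views,
# starting on Monday 7 January 2013, at 100, 200 and 300 views per day.
start = date(2013, 1, 7)
toy_views = {}
for week_index, per_day in enumerate([100, 200, 300]):
    for day_index in range(7):
        toy_views[start + timedelta(days=7 * week_index + day_index)] = per_day

totals = weekly_totals(toy_views)
weeks = sorted(totals)
view_counts = [totals[w] for w in weeks]

# Hypothetical weekly influenza-like-illness percentages for the same weeks.
toy_ili_percent = [1.0, 2.0, 3.0]

a, b = fit_line(view_counts, toy_ili_percent)


def estimate_ili(weekly_view_count):
    """Estimate the illness percentage for a week from its view total."""
    return a + b * weekly_view_count
```

In this sketch, view counts for the current week can be plugged into `estimate_ili` as soon as they are tallied, which is the sense in which traffic data could run ahead of officially reported figures. A real system would need many articles, several seasons of data, and a model suited to count data.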
The authors comment: "Each influenza season provides new challenges and uncertainties to both the public as well as the public health community. We're hoping that with this new method of influenza monitoring, we can harness publicly available data to help people get accurate, near-realtime information about the level of disease burden in the population."
Following further validation, the model could be used as an automated system for estimating flu levels in the USA, providing support for traditional influenza surveillance tools.
McIver DJ, Brownstein JS (2014) Wikipedia Usage Estimates Prevalence of Influenza-Like Illness in the United States in Near Real-Time. PLoS Comput Biol 10(4): e1003581. DOI: 10.1371/journal.pcbi.1003581