Novel model predicts COVID-19 outbreak two weeks ahead of time
People's social behavior, reflected in their mobility data, is providing scientists with a way to forecast the spread of COVID-19 nationwide at the county level. Researchers from Florida Atlantic University's College of Engineering and Computer Science and collaborators have developed the first data-driven deep learning model with the potential to predict an outbreak in COVID-19 cases two weeks in advance. Findings from this study have important implications for managing the current pandemic as well as future pandemics.
For the study, published in the Journal of Big Data, researchers integrated driving-mobility data collected by the Apple Maps app, COVID-19 statistics and county-level demographics from 531 counties in the United States. They trained their long short-term memory (LSTM) deep learning model to capture the effect of government responses on COVID-19 cases as well as the effect of age on the spread of COVID-19.
"Data-driven models can learn from the history of the disease. For example, they can use mobility data such as transportation and walking, which provides a near-real-time change in movement patterns, to learn the effect of social behavior on the reproduction rate," said Behnaz Ghoraani, Ph.D., senior author, associate professor in the Department of Electrical Engineering and Computer Science, and fellow of the FAU Institute for Sensing and Embedded Network Systems Engineering (I-SENSE). "An increase in mobility shows an increase in the interaction between people, especially in areas with high population density. Therefore, feeding the mobility data to epidemiological forecasting models helps to estimate COVID-19 growth as well as evaluating the effects of government policies such as mandating masks on the spread of COVID-19."
For the study, researchers explored three age demographics: young adults, adults and retirees. For each age group, they identified the counties in which that group made up more than a given percentage of the population. For example, they identified the counties where more than 10 percent of the population is young and calculated the average daily cases for those counties. They then raised the threshold in steps of 10 percentage points, up to 70 percent, and repeated the analysis.
Results showed that average daily cases decreased as the percentage of retirees increased and rose as the percentage of young people increased. Average daily cases doubled when the young population threshold increased from 10 to 20 percent and tripled when it increased to 30 percent. The inverse pattern occurred as the percentage of retirees increased.
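The threshold analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function name, the field names (`young_pct`, `retiree_pct`, `avg_daily_cases`) and the three toy counties are all hypothetical and are not data from the study.

```python
from statistics import mean


def average_cases_by_threshold(counties, share_key, thresholds=range(10, 80, 10)):
    """For each threshold (in percent), average the daily cases over the
    counties whose population share for `share_key` exceeds that threshold.

    Returns a dict mapping threshold -> average, or None when no county
    qualifies at that threshold.
    """
    results = {}
    for threshold in thresholds:
        selected = [
            county["avg_daily_cases"]
            for county in counties
            if county[share_key] > threshold
        ]
        results[threshold] = mean(selected) if selected else None
    return results


# Toy, made-up counties used only to exercise the function (assumption).
counties = [
    {"name": "A", "young_pct": 15.0, "retiree_pct": 35.0, "avg_daily_cases": 20.0},
    {"name": "B", "young_pct": 25.0, "retiree_pct": 25.0, "avg_daily_cases": 40.0},
    {"name": "C", "young_pct": 35.0, "retiree_pct": 12.0, "avg_daily_cases": 60.0},
]

young = average_cases_by_threshold(counties, "young_pct")
retirees = average_cases_by_threshold(counties, "retiree_pct")
```

Here the thresholds run from 10 to 70 percent in steps of 10, matching the sweep described in the article. Comparing the `young` and `retirees` dictionaries threshold by threshold is one way to see whether average cases move in opposite directions for the two groups.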
Researchers performed a detailed analysis to validate that the model's predictions reflected the same patterns seen in the actual cases with respect to changes in government pandemic regulations and counties' age demographics.
"Change in lockdown policies, mask mandates, and other government responses directly impact daily COVID-19 cases. Hence, the model predictions of the two-week daily cases have to reflect that impact as shown by the actual accumulated two-week cases," said Borko Furht, Ph.D., co-author, professor, Department of Computer and Electrical Engineering and Computer Science, and director of the National Science Foundation (NSF) Industry/University Cooperative Research Center for Advanced Knowledge Enablement (CAKE). "Predicting spread at the county level accounts for the influence of low-level local policies and will help to provide better forecasting to support national and state predictions. For example, short-term predictions of the accumulated cases can be used to plan and decide whether or not a lockdown is necessary."
Researchers retrieved dynamic data from Feb. 15, 2020 to Jan. 22, 2021, and filtered out counties with a population density of fewer than 150 people per square mile. They trained the LSTM model to learn how past and current case counts and people's mobility affect future cases. They used data from 424 counties over 168 days for training and from 107 counties over 168 days for validation. The model's predictions showed a significant correlation with actual cases when tested on the interval from Aug. 1, 2020 to Jan. 22, 2021. The model was able to predict both increases and decreases in the total number of cases.
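To illustrate the setup described above, the following Python sketch splits counties into training and validation groups and turns one county's daily series into input windows paired with accumulated two-week targets. It is a sketch under stated assumptions, not the study's pipeline: the function names, the 14-day history length, the random split and the placeholder series are hypothetical; only the 424/107 county counts, the 168-day span and the two-week horizon come from the article.

```python
import random


def split_counties(county_ids, n_train, seed=0):
    """Shuffle county IDs and split them into training and validation groups.
    A random split is an assumption; the article does not say how counties
    were assigned."""
    ids = list(county_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def make_windows(cases, mobility, history_days=14, horizon_days=14):
    """Build (input, target) pairs from one county's daily series.

    Each input is `history_days` consecutive days of [cases, mobility];
    each target is the sum of cases over the following `horizon_days`.
    The 14-day history length is an assumption.
    """
    if len(cases) != len(mobility):
        raise ValueError("cases and mobility must cover the same days")
    inputs, targets = [], []
    last_start = len(cases) - history_days - horizon_days
    for start in range(last_start + 1):
        end = start + history_days
        inputs.append([[cases[day], mobility[day]] for day in range(start, end)])
        targets.append(sum(cases[end:end + horizon_days]))
    return inputs, targets


# 531 counties split 424 / 107, as reported in the article.
train_ids, val_ids = split_counties(range(531), n_train=424)

# Placeholder 168-day series for a single county (made-up values).
cases = list(range(168))
mobility = [1.0] * 168
inputs, targets = make_windows(cases, mobility)
```

Windows shaped this way (samples, days, features) are the usual input layout for a sequence model such as an LSTM. Splitting by county rather than by date means validation counties are never seen during training.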
"The deep learning model developed by our researchers is especially relevant now as cases of the Delta variant escalate in Florida and throughout the nation," said Stella Batalama, Ph.D., dean, College of Engineering and Computer Science. "Many infected populations remain asymptomatic while spreading the virus, making it challenging for traditional mechanistic models to predict an upcoming outbreak accurately. The work by professors Ghoraani and Furht and our collaborators from Lexis-Nexis Risk Solutions has significant applications for effective management of the pandemic and future outbreaks, which has the potential to save lives and keep our economies thriving."