Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction

Hayate Iso, Shoko Wakamiya, Eiji Aramaki


Abstract
Because of the increasing popularity of social media, much information is now shared on the internet, enabling social media users to follow various real-world events. In particular, social media-based infectious disease surveillance has attracted increasing attention. In this work, we specifically examine influenza, a common topic of communication on social media. The core hypothesis of this work is that several words, such as symptom words (fever, headache, etc.), appear in advance of a flu epidemic. Consequently, past word occurrences can contribute to estimating the current number of patients. To employ such forecasting words, we first estimate the optimal time lag for each word based on cross correlation. We then build a linear model consisting of word frequencies at different time points for nowcasting and forecasting influenza epidemics. In experiments using 7.7 million tweets from August 2012 to January 2016, the proposed model achieved the best nowcasting performance to date (correlation ratio 0.93) and practically sufficient forecasting performance (correlation ratio 0.91 for 1-week-ahead prediction and 0.77 for 3-week-ahead prediction). This report is the first in the relevant literature to describe a model enabling prediction of future epidemics using Twitter.
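The two steps described in the abstract (per-word lag estimation by cross correlation, then a linear model over lagged word frequencies) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `best_lag` and `lagged_design`, the noise-free synthetic weekly series, the maximum lag of 4 weeks, and the use of plain least squares as a stand-in for whatever estimator the authors actually fit.

```python
import numpy as np

def best_lag(word_freq, patients, max_lag):
    """Return (lag, r): the shift in time steps at which past word
    frequency correlates best with the current patient count."""
    n = len(patients)
    best, best_r = 0, -np.inf
    for lag in range(max_lag + 1):
        # Pair word frequency at time t - lag with patients at time t.
        r = np.corrcoef(word_freq[: n - lag], patients[lag:])[0, 1]
        if r > best_r:
            best, best_r = lag, r
    return best, best_r

def lagged_design(word_freqs, lags, max_lag):
    """Design matrix for t = max_lag .. n-1: column j holds word j
    at time t - lags[j]; the last column is an intercept."""
    n = word_freqs.shape[0]
    cols = [word_freqs[max_lag - lag : n - lag, j] for j, lag in enumerate(lags)]
    return np.column_stack(cols + [np.ones(n - max_lag)])

# Synthetic, noise-free weekly data (hypothetical, for illustration only).
t = np.arange(120)
patients = 100 + 50 * np.sin(2 * np.pi * t / 52)
fever = 100 + 50 * np.sin(2 * np.pi * (t + 2) / 52)  # leads patients by 2 weeks
cough = 100 + 50 * np.sin(2 * np.pi * (t + 1) / 52)  # leads patients by 1 week
word_freqs = np.column_stack([fever, cough])

max_lag = 4
# Step 1: estimate one lag per word from the cross correlation.
lags = [best_lag(word_freqs[:, j], patients, max_lag)[0] for j in range(2)]

# Step 2: fit a linear model on the lagged word frequencies.
X = lagged_design(word_freqs, lags, max_lag)
y = patients[max_lag:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef
print("estimated lags:", lags)
```

On this toy data the estimated lags recover the shifts built into the two word series. Real tweet counts are noisy and high-dimensional, so a practical model would likely need some form of regularization or word selection, which this sketch omits.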
Anthology ID:
C16-1008
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
76–86
URL:
https://aclanthology.org/C16-1008
Cite (ACL):
Hayate Iso, Shoko Wakamiya, and Eiji Aramaki. 2016. Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 76–86, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction (Iso et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1008.pdf
Presentation:
C16-1008.Presentation.pdf