Real-time Change Point Detection using On-line Topic Models

Yunli Wang, Cyril Goutte


Abstract
Detecting changes within an unfolding event in real time from news articles or social media makes it possible to react promptly to serious issues in public safety, public health or natural disasters. In this study, we use on-line Latent Dirichlet Allocation (LDA) to model shifts in topics, and apply on-line change point detection (CPD) algorithms to detect when significant changes happen. We describe an on-line Bayesian change point detection algorithm that we use to detect topic changes from on-line LDA output. Extensive experiments on social media data and news articles show the benefits of on-line LDA versus standard LDA, and of on-line change point detection compared to off-line algorithms. This yields F-scores up to 52% on the detection of significant real-life changes from these document streams.
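To make the idea of on-line Bayesian change point detection more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a generic constant-hazard run-length recursion with a Gaussian observation model of known variance, applied to a one-dimensional signal that stands in for the proportion of a single topic over time, as on-line LDA output might provide. The function names and all parameter values (`hazard`, `obs_var`, `prior_mean`, `prior_var`) are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def detect_changes(signal, hazard=0.02, obs_var=0.01, prior_mean=0.5, prior_var=1.0):
    """Process the signal one value at a time and return estimated change indices.

    Assumptions (illustrative only): each segment has a constant unknown level
    with a normal prior, observations have known variance obs_var, and a change
    can occur at every step with constant probability hazard.
    """
    run_probs = [1.0]         # run_probs[r]: P(current run has absorbed r points)
    means = [prior_mean]      # posterior mean of the segment level, per run length
    variances = [prior_var]   # posterior variance of the segment level, per run length
    prev_map = 0
    changes = []

    for t, x in enumerate(signal):
        # Predictive density of x under each candidate run length.
        pred = [gaussian_pdf(x, m, v + obs_var) for m, v in zip(means, variances)]

        # Either the run grows by one point, or a change resets it to zero.
        growth = [p * q * (1.0 - hazard) for p, q in zip(run_probs, pred)]
        reset = sum(p * q * hazard for p, q in zip(run_probs, pred))
        new_probs = [reset] + growth
        total = sum(new_probs)
        run_probs = [p / total for p in new_probs]

        # Conjugate update of the level for every run that absorbed x;
        # run length 0 restarts from the prior.
        new_means, new_vars = [prior_mean], [prior_var]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            new_means.append(post_var * (m / v + x / obs_var))
            new_vars.append(post_var)
        means, variances = new_means, new_vars

        # A drop in the most probable run length signals a change; the new
        # segment is estimated to start map_run - 1 points before index t.
        map_run = max(range(len(run_probs)), key=run_probs.__getitem__)
        if map_run < prev_map:
            changes.append(t - map_run + 1)
        prev_map = map_run

    return changes


# Synthetic topic proportion that jumps from about 0.2 to about 0.8.
signal = [0.2 + 0.01 * ((i % 3) - 1) for i in range(30)]
signal += [0.8 + 0.01 * ((i % 3) - 1) for i in range(30)]
print(detect_changes(signal))
```

The sketch keeps every run-length hypothesis, so its cost grows quadratically with the length of the stream; a practical on-line detector would typically prune hypotheses with negligible probability.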
Anthology ID: C18-1212
Volume: Proceedings of the 27th International Conference on Computational Linguistics
Month: August
Year: 2018
Address: Santa Fe, New Mexico, USA
Editors: Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 2505–2515
URL: https://aclanthology.org/C18-1212
Cite (ACL): Yunli Wang and Cyril Goutte. 2018. Real-time Change Point Detection using On-line Topic Models. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2505–2515, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal): Real-time Change Point Detection using On-line Topic Models (Wang & Goutte, COLING 2018)
PDF: https://aclanthology.org/C18-1212.pdf