SEDTWik: Segmentation-based Event Detection from Tweets Using Wikipedia

Keval Morabia, Neti Lalita Bhanu Murthy, Aruna Malapati, Surender Samant


Abstract
Event detection is a research area in text mining that has attracted attention during this decade due to the widespread availability of social media data, specifically Twitter data. Twitter has become a major source of information about real-world events because of the use of hashtags and because its strict length limit ensures concise presentation of events. Previous work on event detection from tweets either applies only to localized events or breaking news, or misses many important events. This paper presents the problems associated with event detection from tweets and a tweet-segmentation-based system for event detection called SEDTWik, an extension of previous work, that is able to detect newsworthy events from a wide range of categories occurring at different locations around the world. The main idea is to split each tweet and hashtag into segments, extract bursty segments, cluster them, and summarize them. We evaluated our results on the well-known Events2012 corpus and achieved state-of-the-art results.
Keywords: Event detection, Twitter, Social Media, Microblogging, Tweet segmentation, Text Mining, Wikipedia, Hashtag.
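The first two stages named in the abstract (splitting tweets and hashtags into segments, then extracting bursty segments) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the phrase set `KNOWN_PHRASES` is a hypothetical stand-in for a Wikipedia title dictionary, the greedy longest-match segmenter and the binomial z-score burst test are simplifying assumptions, and all function names are invented for illustration. Clustering and summarization are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a dictionary of Wikipedia page titles (assumption).
KNOWN_PHRASES = {"barack obama", "white house", "hurricane sandy"}
MAX_SEGMENT_WORDS = 3  # assumed upper bound on segment length


def split_hashtag(tag):
    """Split a camel-case hashtag such as '#WhiteHouse' into lowercase words."""
    pattern = r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+"
    return [w.lower() for w in re.findall(pattern, tag.lstrip("#"))]


def segment(words):
    """Greedy longest-match segmentation against KNOWN_PHRASES.

    Any word that does not start a known phrase becomes a one-word segment.
    """
    segments, i = [], 0
    while i < len(words):
        for n in range(min(MAX_SEGMENT_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in KNOWN_PHRASES:
                segments.append(candidate)
                i += n
                break
    return segments


def tweet_segments(text):
    """Segment the plain words of a tweet, then each hashtag separately."""
    tokens = text.split()
    plain = [t.lower().strip(".,!?") for t in tokens if not t.startswith("#")]
    segments = segment([w for w in plain if w])
    for tag in (t for t in tokens if t.startswith("#")):
        segments.extend(segment(split_hashtag(tag)))
    return segments


def bursty_segments(window_tweets, background_prob, threshold=2.0):
    """Score segments seen in more tweets than a binomial background predicts.

    background_prob maps a segment to its assumed long-run probability of
    appearing in a tweet; unseen segments get a small default (assumption).
    Returns {segment: z_score} for segments with z_score > threshold.
    """
    n = len(window_tweets)
    counts = Counter()
    for text in window_tweets:
        counts.update(set(tweet_segments(text)))  # count each tweet once
    scores = {}
    for seg, freq in counts.items():
        p = background_prob.get(seg, 1e-3)
        std = math.sqrt(n * p * (1 - p))
        if std == 0:
            continue  # degenerate background probability, skip
        z = (freq - n * p) / std
        if z > threshold:
            scores[seg] = z
    return scores


window = [
    "Hurricane Sandy hits the coast #HurricaneSandy",
    "stay safe #HurricaneSandy",
    "the coast is flooded",
    "the game tonight",
]
background = {"the": 0.5, "hurricane sandy": 0.01}
bursts = bursty_segments(window, background)
```

In this toy window, "hurricane sandy" appears in two of four tweets against an assumed background rate of 1%, so it scores as bursty, while "the" stays below the threshold. A real system would estimate background probabilities from a large historical tweet collection rather than hard-coding them.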
Anthology ID:
N19-3011
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Sudipta Kar, Farah Nadeem, Laura Burdick, Greg Durrett, Na-Rae Han
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
77–85
URL:
https://aclanthology.org/N19-3011/
DOI:
10.18653/v1/N19-3011
Cite (ACL):
Keval Morabia, Neti Lalita Bhanu Murthy, Aruna Malapati, and Surender Samant. 2019. SEDTWik: Segmentation-based Event Detection from Tweets Using Wikipedia. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 77–85, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
SEDTWik: Segmentation-based Event Detection from Tweets Using Wikipedia (Morabia et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-3011.pdf
Code:
kevalmorabia97/SEDTWik-Event-Detection-from-Tweets