Event Extraction in Video Transcripts

Amir Pouran Ben Veyseh, Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen


Abstract
Event extraction (EE) is one of the fundamental tasks for information extraction whose goal is to identify mentions of events and their participants in text. Due to its importance, different methods and datasets have been introduced for EE. However, existing EE datasets are limited to formally written documents such as news articles or scientific papers. As such, the challenges of EE in informal and noisy texts are not adequately studied. In particular, video transcripts constitute an important domain that can benefit tremendously from EE systems (e.g., video retrieval), but has not been studied in EE literature due to the lack of necessary datasets. To address this limitation, we propose the first large-scale EE dataset obtained for transcripts of streamed videos on the video hosting platform Behance to promote future research in this area. In addition, we extensively evaluate existing state-of-the-art EE methods on our new dataset. We demonstrate that such systems cannot achieve adequate performance on the proposed dataset, revealing challenges and opportunities for further research effort.
Anthology ID:
2022.coling-1.625
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
SIG:
Publisher:
International Committee on Computational Linguistics
Note:
Pages:
7156–7165
Language:
URL:
https://aclanthology.org/2022.coling-1.625
DOI:
Bibkey:
Cite (ACL):
Amir Pouran Ben Veyseh, Viet Dac Lai, Franck Dernoncourt, and Thien Huu Nguyen. 2022. Event Extraction in Video Transcripts. In Proceedings of the 29th International Conference on Computational Linguistics, pages 7156–7165, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Event Extraction in Video Transcripts (Veyseh et al., COLING 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.coling-1.625.pdf