@inproceedings{kargupta-etal-2025-synergizing,
title = "Synergizing Unsupervised Episode Detection with {LLM}s for Large-Scale News Events",
author = "Kargupta, Priyanka and
Zhang, Yunyi and
Jiao, Yizhu and
Ouyang, Siru and
Han, Jiawei",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.1433/",
doi = "10.18653/v1/2025.acl-long.1433",
pages = "29648--29663",
ISBN = "979-8-89176-251-0",
abstract = "State-of-the-art automatic event detection struggles with interpretability and adaptability to evolving large-scale key events{---}unlike episodic structures, which excel in these areas. Often overlooked, episodes represent cohesive clusters of core entities performing actions at a specific time and location; a partially ordered sequence of episodes can represent a key event. This paper introduces a novel task, **episode detection**, which identifies episodes within a news corpus of key event articles. Detecting episodes poses unique challenges, as they lack explicit temporal or locational markers and cannot be merged using semantic similarity alone. While large language models (LLMs) can aid with these reasoning difficulties, they suffer with long contexts typical of news corpora. To address these challenges, we introduce **EpiMine**, an unsupervised framework that identifies a key event{'}s candidate episodes by leveraging natural episodic partitions in articles, estimated through shifts in discriminative term combinations. These candidate episodes are more cohesive and representative of true episodes, synergizing with LLMs to better interpret and refine them into final episodes. We apply EpiMine to our three diverse, real-world event datasets annotated at the episode level, where it achieves a 59.2{\%} average gain across all metrics compared to baselines."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="kargupta-etal-2025-synergizing">
    <titleInfo>
      <title>Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Priyanka</namePart>
      <namePart type="family">Kargupta</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Yunyi</namePart>
      <namePart type="family">Zhang</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Yizhu</namePart>
      <namePart type="family">Jiao</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Siru</namePart>
      <namePart type="family">Ouyang</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Jiawei</namePart>
      <namePart type="family">Han</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2025-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Wanxiang</namePart>
        <namePart type="family">Che</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Joyce</namePart>
        <namePart type="family">Nabende</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Ekaterina</namePart>
        <namePart type="family">Shutova</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Mohammad</namePart>
        <namePart type="given">Taher</namePart>
        <namePart type="family">Pilehvar</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Vienna, Austria</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
      <identifier type="isbn">979-8-89176-251-0</identifier>
    </relatedItem>
    <abstract>State-of-the-art automatic event detection struggles with interpretability and adaptability to evolving large-scale key events—unlike episodic structures, which excel in these areas. Often overlooked, episodes represent cohesive clusters of core entities performing actions at a specific time and location; a partially ordered sequence of episodes can represent a key event. This paper introduces a novel task, **episode detection**, which identifies episodes within a news corpus of key event articles. Detecting episodes poses unique challenges, as they lack explicit temporal or locational markers and cannot be merged using semantic similarity alone. While large language models (LLMs) can aid with these reasoning difficulties, they suffer with long contexts typical of news corpora. To address these challenges, we introduce **EpiMine**, an unsupervised framework that identifies a key event’s candidate episodes by leveraging natural episodic partitions in articles, estimated through shifts in discriminative term combinations. These candidate episodes are more cohesive and representative of true episodes, synergizing with LLMs to better interpret and refine them into final episodes. We apply EpiMine to our three diverse, real-world event datasets annotated at the episode level, where it achieves a 59.2% average gain across all metrics compared to baselines.</abstract>
    <identifier type="citekey">kargupta-etal-2025-synergizing</identifier>
    <identifier type="doi">10.18653/v1/2025.acl-long.1433</identifier>
    <location>
      <url>https://aclanthology.org/2025.acl-long.1433/</url>
    </location>
    <part>
      <date>2025-07</date>
      <extent unit="page">
        <start>29648</start>
        <end>29663</end>
      </extent>
    </part>
  </mods>
</modsCollection>

%0 Conference Proceedings
%T Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events
%A Kargupta, Priyanka
%A Zhang, Yunyi
%A Jiao, Yizhu
%A Ouyang, Siru
%A Han, Jiawei
%Y Che, Wanxiang
%Y Nabende, Joyce
%Y Shutova, Ekaterina
%Y Pilehvar, Mohammad Taher
%S Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-251-0
%F kargupta-etal-2025-synergizing
%X State-of-the-art automatic event detection struggles with interpretability and adaptability to evolving large-scale key events—unlike episodic structures, which excel in these areas. Often overlooked, episodes represent cohesive clusters of core entities performing actions at a specific time and location; a partially ordered sequence of episodes can represent a key event. This paper introduces a novel task, **episode detection**, which identifies episodes within a news corpus of key event articles. Detecting episodes poses unique challenges, as they lack explicit temporal or locational markers and cannot be merged using semantic similarity alone. While large language models (LLMs) can aid with these reasoning difficulties, they suffer with long contexts typical of news corpora. To address these challenges, we introduce **EpiMine**, an unsupervised framework that identifies a key event’s candidate episodes by leveraging natural episodic partitions in articles, estimated through shifts in discriminative term combinations. These candidate episodes are more cohesive and representative of true episodes, synergizing with LLMs to better interpret and refine them into final episodes. We apply EpiMine to our three diverse, real-world event datasets annotated at the episode level, where it achieves a 59.2% average gain across all metrics compared to baselines.
%R 10.18653/v1/2025.acl-long.1433
%U https://aclanthology.org/2025.acl-long.1433/
%U https://doi.org/10.18653/v1/2025.acl-long.1433
%P 29648-29663
Markdown (Informal)
[Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events](https://aclanthology.org/2025.acl-long.1433/) (Kargupta et al., ACL 2025)
ACL
Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, and Jiawei Han. 2025. [Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events](https://aclanthology.org/2025.acl-long.1433/). In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 29648–29663, Vienna, Austria. Association for Computational Linguistics.