LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World

Sina Semnani; Pingyue Zhang; Wanyue Zhai; Haozhuo Li; Ryan Beauchamp; Trey Billing; Katayoun Kishi; Manling Li; Monica Lam

doi:10.18653/v1/2025.findings-acl.1325

Abstract

This paper presents LEMONADE, a large-scale conflict event dataset comprising 39,786 events across 20 languages and 171 countries, with extensive coverage of region-specific entities. LEMONADE is based on a partially reannotated subset of the Armed Conflict Location & Event Data (ACLED), which has documented global conflict events for over a decade.

To address the challenge of aggregating multilingual sources for global event analysis, we introduce abstractive event extraction (AEE) and its subtask, abstractive entity linking (AEL). Unlike conventional span-based event extraction, our approach detects event arguments and entities through holistic document understanding and normalizes them across the multilingual dataset. We evaluate various large language models (LLMs) on these tasks, adapt existing zero-shot event extraction systems, and benchmark supervised models. Additionally, we introduce ZEST, a novel zero-shot retrieval-based system for AEL.

Our best zero-shot system achieves an end-to-end F1 score of 58.3%, with LLMs outperforming specialized event extraction models such as GoLLIE. For entity linking, ZEST achieves an F1 score of 45.7%, significantly surpassing OneNet, a state-of-the-art zero-shot baseline that achieves only 23.7%. However, these zero-shot results lag behind the best supervised systems by 20.1% and 37.0% in the end-to-end and AEL tasks, respectively, highlighting the need for further research.

Anthology ID: 2025.findings-acl.1325
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 25813–25852
URL: https://aclanthology.org/2025.findings-acl.1325/
DOI: 10.18653/v1/2025.findings-acl.1325
Cite (ACL): Sina Semnani, Pingyue Zhang, Wanyue Zhai, Haozhuo Li, Ryan Beauchamp, Trey Billing, Katayoun Kishi, Manling Li, and Monica Lam. 2025. LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World. In Findings of the Association for Computational Linguistics: ACL 2025, pages 25813–25852, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World (Semnani et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.1325.pdf
