OpenForecast: A Large-Scale Open-Ended Event Forecasting Dataset

Zhen Wang, Xi Zhou, Yating Yang, Bo Ma, Lei Wang, Rui Dong, Azmat Anwar


Abstract
Complex events generally exhibit unforeseen, multifaceted, and multi-step developments, and cannot be well handled by existing closed-ended event forecasting methods, which are constrained by a limited answer space. In order to accelerate the research on complex event forecasting, we introduce OpenForecast, a large-scale open-ended dataset with two features: (1) OpenForecast defines three open-ended event forecasting tasks, enabling unforeseen, multifaceted, and multi-step forecasting. (2) OpenForecast collects and annotates a large-scale dataset from Wikipedia and news, including 43,419 complex events spanning from 1950 to 2024. Particularly, this annotation can be completed automatically without any manual annotation cost. Meanwhile, we introduce an automatic LLM-based Retrieval-Augmented Evaluation method (LRAE) for complex events, enabling OpenForecast to evaluate the ability of complex event forecasting of large language models. Finally, we conduct comprehensive human evaluations to verify the quality and challenges of OpenForecast, and the consistency between LEAE metric and human evaluation. OpenForecast and related codes will be publicly released.
Anthology ID:
2025.coling-main.353
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
5273–5294
Language:
URL:
https://aclanthology.org/2025.coling-main.353/
DOI:
Bibkey:
Cite (ACL):
Zhen Wang, Xi Zhou, Yating Yang, Bo Ma, Lei Wang, Rui Dong, and Azmat Anwar. 2025. OpenForecast: A Large-Scale Open-Ended Event Forecasting Dataset. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5273–5294, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
OpenForecast: A Large-Scale Open-Ended Event Forecasting Dataset (Wang et al., COLING 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.coling-main.353.pdf