@inproceedings{kurfali-etal-2025-climateeval,
title = "{C}limate{E}val: A Comprehensive Benchmark for {NLP} Tasks Related to Climate Change",
author = "Kurfali, Murathan and
Zahra, Shorouq and
Nivre, Joakim and
Messori, Gabriele",
editor = "Dutia, Kalyan and
Henderson, Peter and
Leippold, Markus and
Manning, Christopher and
Morio, Gaku and
Muccione, Veruska and
Ni, Jingwei and
Schimanski, Tobias and
Stammbach, Dominik and
Singh, Alok and
Su, Alba (Ruiran) and
A. Vaghefi, Saeid",
booktitle = "Proceedings of the 2nd Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2025)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.climatenlp-1.13/",
doi = "10.18653/v1/2025.climatenlp-1.13",
pages = "194--207",
ISBN = "979-8-89176-259-6",
abstract = "ClimateEval is a comprehensive benchmark designed to evaluate natural language processing models across a broad range of tasks related to climate change. ClimateEval aggregates existing datasets along with a newly developed news classification dataset, created specifically for this release. This results in a benchmark of 25 tasks based on 13 datasets, covering key aspects of climate discourse, including text classification, question answering, and information extraction. Our benchmark provides a standardized evaluation suite for systematically assessing the performance of large language models (LLMs) on these tasks. Additionally, we conduct an extensive evaluation of open-source LLMs (ranging from 2B to 70B parameters) in both zero-shot and few-shot settings, analyzing their strengths and limitations in the domain of climate change."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="kurfali-etal-2025-climateeval">
<titleInfo>
<title>ClimateEval: A Comprehensive Benchmark for NLP Tasks Related to Climate Change</title>
</titleInfo>
<name type="personal">
<namePart type="given">Murathan</namePart>
<namePart type="family">Kurfali</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shorouq</namePart>
<namePart type="family">Zahra</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Joakim</namePart>
<namePart type="family">Nivre</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gabriele</namePart>
<namePart type="family">Messori</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 2nd Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2025)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kalyan</namePart>
<namePart type="family">Dutia</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Peter</namePart>
<namePart type="family">Henderson</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Markus</namePart>
<namePart type="family">Leippold</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Christopher</namePart>
<namePart type="family">Manning</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gaku</namePart>
<namePart type="family">Morio</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Veruska</namePart>
<namePart type="family">Muccione</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jingwei</namePart>
<namePart type="family">Ni</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tobias</namePart>
<namePart type="family">Schimanski</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Dominik</namePart>
<namePart type="family">Stammbach</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alok</namePart>
<namePart type="family">Singh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alba</namePart>
<namePart type="given">(Ruiran)</namePart>
<namePart type="family">Su</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Saeid</namePart>
<namePart type="family">A. Vaghefi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Vienna, Austria</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-259-6</identifier>
</relatedItem>
<abstract>ClimateEval is a comprehensive benchmark designed to evaluate natural language processing models across a broad range of tasks related to climate change. ClimateEval aggregates existing datasets along with a newly developed news classification dataset, created specifically for this release. This results in a benchmark of 25 tasks based on 13 datasets, covering key aspects of climate discourse, including text classification, question answering, and information extraction. Our benchmark provides a standardized evaluation suite for systematically assessing the performance of large language models (LLMs) on these tasks. Additionally, we conduct an extensive evaluation of open-source LLMs (ranging from 2B to 70B parameters) in both zero-shot and few-shot settings, analyzing their strengths and limitations in the domain of climate change.</abstract>
<identifier type="citekey">kurfali-etal-2025-climateeval</identifier>
<identifier type="doi">10.18653/v1/2025.climatenlp-1.13</identifier>
<location>
<url>https://aclanthology.org/2025.climatenlp-1.13/</url>
</location>
<part>
<date>2025-07</date>
<extent unit="page">
<start>194</start>
<end>207</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T ClimateEval: A Comprehensive Benchmark for NLP Tasks Related to Climate Change
%A Kurfali, Murathan
%A Zahra, Shorouq
%A Nivre, Joakim
%A Messori, Gabriele
%Y Dutia, Kalyan
%Y Henderson, Peter
%Y Leippold, Markus
%Y Manning, Christopher
%Y Morio, Gaku
%Y Muccione, Veruska
%Y Ni, Jingwei
%Y Schimanski, Tobias
%Y Stammbach, Dominik
%Y Singh, Alok
%Y Su, Alba (Ruiran)
%Y A. Vaghefi, Saeid
%S Proceedings of the 2nd Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2025)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-259-6
%F kurfali-etal-2025-climateeval
%X ClimateEval is a comprehensive benchmark designed to evaluate natural language processing models across a broad range of tasks related to climate change. ClimateEval aggregates existing datasets along with a newly developed news classification dataset, created specifically for this release. This results in a benchmark of 25 tasks based on 13 datasets, covering key aspects of climate discourse, including text classification, question answering, and information extraction. Our benchmark provides a standardized evaluation suite for systematically assessing the performance of large language models (LLMs) on these tasks. Additionally, we conduct an extensive evaluation of open-source LLMs (ranging from 2B to 70B parameters) in both zero-shot and few-shot settings, analyzing their strengths and limitations in the domain of climate change.
%R 10.18653/v1/2025.climatenlp-1.13
%U https://aclanthology.org/2025.climatenlp-1.13/
%U https://doi.org/10.18653/v1/2025.climatenlp-1.13
%P 194-207
Markdown (Informal)
[ClimateEval: A Comprehensive Benchmark for NLP Tasks Related to Climate Change](https://aclanthology.org/2025.climatenlp-1.13/) (Kurfali et al., ClimateNLP 2025)
ACL