Long-Form Analogy Evaluation Challenge

Bhavya Bhavya, Chris Palaguachi, Yang Zhou, Suma Bhat, ChengXiang Zhai


Abstract
Given the practical applications of analogies, recent work has studied analogy generation to explain concepts. However, not all generated analogies are of high quality, and it is unclear how to measure the quality of this new kind of generated text. To address this challenge, we propose a shared task on automatically evaluating the quality of generated analogies according to seven comprehensive criteria. For this, we will set up a leaderboard based on our dataset, which is annotated with manual ratings along the seven criteria, and provide a baseline solution leveraging GPT-4. We hope that this task will advance the development of new evaluation metrics and methods for analogy generation in natural language, particularly for education.
Anthology ID:
2024.inlg-genchal.1
Volume:
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Simon Mille, Miruna-Adriana Clinciu
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
1–16
URL:
https://aclanthology.org/2024.inlg-genchal.1
Cite (ACL):
Bhavya Bhavya, Chris Palaguachi, Yang Zhou, Suma Bhat, and ChengXiang Zhai. 2024. Long-Form Analogy Evaluation Challenge. In Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges, pages 1–16, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
Long-Form Analogy Evaluation Challenge (Bhavya et al., INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-genchal.1.pdf