Leros: Learning Explicit Reasoning on Synthesized Data for Commonsense Question Answering

Chenhao Wang, Pengfei Cao, Jiachun Li, Yubo Chen, Kang Liu, Xiaojian Jiang, Jiexin Xu, Li Qiuxia, Jun Zhao


Abstract
Recent work shows that large language models can be prompted to generate useful rationales for commonsense question answering (CQA), which can improve the performance of both themselves and other models. However, deploying and further tuning large models is relatively expensive. Some work explores distilling the rationale-generation ability into convenient small-sized models, yet it typically requires human-authored QA instances during distillation. In this paper, we propose a novel framework that leverages both knowledge graphs and large language models to synthesize rationale-augmented CQA data. Based on it, we train Leros, a model that can generate helpful rationales to assist generic QA models in accomplishing unseen CQA tasks. Empirical results demonstrate that Leros can substantially enhance the performance of QA models on five unseen CQA benchmarks, providing better gains than both same-sized counterpart models trained with downstream data and 10x larger language models. Our work reveals a novel way to integrate knowledge from both knowledge graphs and large language models into smaller models. The code and synthesized resources are publicly available at https://github.com/wchrepo/leros.
Anthology ID:
2024.lrec-main.900
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
10303–10315
URL:
https://aclanthology.org/2024.lrec-main.900
Cite (ACL):
Chenhao Wang, Pengfei Cao, Jiachun Li, Yubo Chen, Kang Liu, Xiaojian Jiang, Jiexin Xu, Li Qiuxia, and Jun Zhao. 2024. Leros: Learning Explicit Reasoning on Synthesized Data for Commonsense Question Answering. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 10303–10315, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Leros: Learning Explicit Reasoning on Synthesized Data for Commonsense Question Answering (Wang et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.900.pdf