CHAmbi: A New Benchmark on Chinese Ambiguity Challenges for Large Language Models

Qin Zhang, Sihan Cai, Jiaxu Zhao, Mykola Pechenizkiy, Meng Fang


Abstract
Ambiguity is an inherent feature of language, and managing it is crucial for effective communication and collaboration. This is particularly true for Chinese, a language with extensive lexical-morphemic ambiguity. Despite the wide use of large language models (LLMs) in numerous domains and their growing proficiency in Chinese, there is a notable lack of datasets for thoroughly evaluating LLMs’ ability to handle ambiguity in Chinese. To bridge this gap, we introduce the CHAmbi dataset, a specialized Chinese multi-label disambiguation dataset formatted as a Natural Language Inference task. It comprises 4,991 pairs of premises and hypotheses, including 824 examples featuring a wide range of ambiguities. In addition to the dataset, we develop a series of tests and conduct an extensive evaluation of pre-trained LLMs’ proficiency in identifying and resolving ambiguity in the Chinese language. Our findings reveal that GPT-4 consistently delivers commendable performance across various evaluative measures, albeit with limitations in robustness. The performances of other LLMs, however, vary considerably on ambiguity-related tasks, underscoring the complexity of such tasks in the context of Chinese. The overall results highlight the challenge that ambiguity handling poses for current LLMs and underscore the need for further enhancement of LLM capabilities for effective ambiguity resolution in Chinese.
Anthology ID:
2024.findings-emnlp.875
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14883–14898
URL:
https://aclanthology.org/2024.findings-emnlp.875
Cite (ACL):
Qin Zhang, Sihan Cai, Jiaxu Zhao, Mykola Pechenizkiy, and Meng Fang. 2024. CHAmbi: A New Benchmark on Chinese Ambiguity Challenges for Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 14883–14898, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
CHAmbi: A New Benchmark on Chinese Ambiguity Challenges for Large Language Models (Zhang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.875.pdf