Differences in Semantic Errors Made by Different Types of Data-to-text Systems

Rudali Huidrom, Anya Belz, Michela Lorandi


Abstract
In this paper, we investigate how the semantic, or content-related, errors made by different types of data-to-text systems differ in number and type. In total, we examine 15 systems: three rule-based and 12 neural systems, including two large language models used without training or fine-tuning. All systems were tested on the English WebNLG dataset version 3.0. We use a semantic error taxonomy and the brat annotation tool to obtain word-span error annotations on a sample of system outputs. The annotations enable us to establish how many semantic errors different (types of) systems make and what specific types of errors they make, and thus to get an overall understanding of semantic strengths and weaknesses among various types of NLG systems. Among our main findings, we observe that symbolic (rule- and template-based) systems make fewer semantic errors overall; non-LLM neural systems have better fluency and data coverage but make more semantic errors; and LLM-based systems require improvement particularly in addressing superfluous.
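As a rough illustration of how word-span annotations of this kind can be turned into per-type error counts, the sketch below tallies text-bound annotations in brat's standoff format (ID, tab, label with start and end offsets, tab, covered text). It is not the authors' code: the function name `count_error_types` and the labels `Addition` and `Substitution` are hypothetical placeholders, not the categories of the paper's taxonomy.

```python
from collections import Counter


def count_error_types(ann_text):
    """Count text-bound annotations per label in brat standoff text."""
    counts = Counter()
    for line in ann_text.splitlines():
        # Text-bound annotations have IDs starting with "T":
        # ID <TAB> LABEL START END <TAB> covered text
        if not line.startswith("T"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip malformed lines
        label = fields[1].split(" ", 1)[0]
        counts[label] += 1
    return counts


# Hypothetical annotations for one system output (labels are assumptions).
sample = (
    "T1\tAddition 0 9\tin London\n"
    "T2\tSubstitution 15 19\t1999\n"
    "T3\tAddition 30 38\tfamously\n"
)

if __name__ == "__main__":
    for label, n in count_error_types(sample).most_common():
        print(label, n)
```

Summing such counters over the outputs of each system, and then over systems of the same type, would give the kind of per-system and per-type totals the abstract refers to.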
Anthology ID:
2024.inlg-main.47
Volume:
Proceedings of the 17th International Natural Language Generation Conference
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
609–621
URL:
https://aclanthology.org/2024.inlg-main.47
Cite (ACL):
Rudali Huidrom, Anya Belz, and Michela Lorandi. 2024. Differences in Semantic Errors Made by Different Types of Data-to-text Systems. In Proceedings of the 17th International Natural Language Generation Conference, pages 609–621, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
Differences in Semantic Errors Made by Different Types of Data-to-text Systems (Huidrom et al., INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-main.47.pdf