Evaluating RDF-to-text Generation Models for English and Russian on Out Of Domain Data

Anna Nikiforovskaya, Claire Gardent


Abstract
While the WebNLG dataset has prompted much research on generation from knowledge graphs, little work has examined how well models trained on WebNLG data generalise to unseen data, and existing work has mostly focused on English. In this paper, we introduce novel benchmarks for both English and Russian which contain various ratios of unseen entities and properties. These benchmarks also differ from WebNLG in that some of the graphs stem from Wikidata rather than DBpedia. Evaluating various models for English and Russian on these benchmarks shows a strong decrease in performance, while a qualitative analysis highlights the various types of errors induced by non-i.i.d. data.
Anthology ID:
2024.inlg-main.11
Volume:
Proceedings of the 17th International Natural Language Generation Conference
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
134–144
URL:
https://aclanthology.org/2024.inlg-main.11
Cite (ACL):
Anna Nikiforovskaya and Claire Gardent. 2024. Evaluating RDF-to-text Generation Models for English and Russian on Out Of Domain Data. In Proceedings of the 17th International Natural Language Generation Conference, pages 134–144, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
Evaluating RDF-to-text Generation Models for English and Russian on Out Of Domain Data (Nikiforovskaya & Gardent, INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-main.11.pdf
Supplementary attachment:
2024.inlg-main.11.Supplementary_Attachment.pdf