DCU-NLG-Small at the GEM’24 Data-to-Text Task: Rule-based generation and post-processing with T5-Base

Simon Mille, Mohammed Sabry, Anya Belz


Abstract
Our submission to the GEM data-to-text shared task assesses the quality of texts produced by combining a rule-based system with a reduced-size language model. A rule-based generator first converts the input triples into semantically correct English text, and a language model then paraphrases this text to make it more fluent. For languages other than English, the texts are translated with the NLLB machine translation system.
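The staged design described in the abstract can be pictured as a simple composition of functions. The following is only a minimal sketch under stated assumptions, not the authors' system: the templates, the function names (`verbalise`, `generate`), and the example property names are all hypothetical, and the paraphrasing and translation stages are left as pluggable callables that default to the identity function, where the submission uses T5-Base and NLLB respectively.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Hypothetical templates for illustration only; a real rule-based
# generator is far richer than a per-property lookup table.
TEMPLATES = {
    "birthPlace": "{s} was born in {o}.",
    "country": "{s} is located in {o}.",
}


def clean(value: str) -> str:
    """Turn an identifier such as 'Alan_Bean' into readable text."""
    return value.replace("_", " ")


def verbalise(triples: List[Triple]) -> str:
    """Stage 1 (assumed shape): map each triple to one English sentence."""
    sentences = []
    for s, p, o in triples:
        # Fall back to a generic pattern for properties without a template.
        template = TEMPLATES.get(p, "The {p} of {s} is {o}.")
        sentences.append(template.format(s=clean(s), p=p, o=clean(o)))
    return " ".join(sentences)


def identity(text: str) -> str:
    """Placeholder for a stage that is not plugged in."""
    return text


def generate(
    triples: List[Triple],
    paraphrase: Callable[[str], str] = identity,  # stand-in for the language model
    translate: Callable[[str], str] = identity,   # stand-in for machine translation
) -> str:
    """Run the three stages in order: verbalise, paraphrase, translate."""
    draft = verbalise(triples)
    fluent = paraphrase(draft)
    return translate(fluent)


if __name__ == "__main__":
    print(generate([("Alan_Bean", "birthPlace", "Wheeler,_Texas")]))
```

Keeping the later stages as plain callables means the rule-based output can be inspected on its own, and a paraphraser or translator can be swapped in without touching the first stage.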
Anthology ID:
2024.inlg-genchal.9
Volume:
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Simon Mille, Miruna-Adriana Clinciu
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
84–91
URL:
https://aclanthology.org/2024.inlg-genchal.9
Cite (ACL):
Simon Mille, Mohammed Sabry, and Anya Belz. 2024. DCU-NLG-Small at the GEM’24 Data-to-Text Task: Rule-based generation and post-processing with T5-Base. In Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges, pages 84–91, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
DCU-NLG-Small at the GEM’24 Data-to-Text Task: Rule-based generation and post-processing with T5-Base (Mille et al., INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-genchal.9.pdf