OSU CompLing at the GEM’24 Data-to-Text Task

Alyssa Allen, Ashley Lewis, Yi-Chien Lin, Tomiris Kaumenova, Michael White


Abstract
This paper details experiments conducted for the GEM 2024 Data-to-Text task on a WebNLG dataset (Gardent et al., 2017). We show that model performance varies greatly across English, Spanish, Chinese, and Russian. Data filtering was done with automatic model judgments via error detection, which performs differently per language. We report English and Spanish dev set results for a data filtering and knowledge distillation approach to generating natural language outputs for sets of triples across a variety of domains. Specifically, we compare three generation conditions: 1) few-shot prompting with ChatGPT (GPT-4), 2) fine-tuning Llama2 on the unfiltered dataset, and 3) fine-tuning Llama2 on a filtered version of the dataset. Russian and Chinese efforts did not result in submissions because inconsistent or incoherent translations were produced in either the data synthesis or final generation stages. We provide details on these shortcomings but largely focus on the Spanish and English efforts that align with our task submissions. We ultimately submitted outputs in English and Spanish that were generated using a version of Llama2 fine-tuned on a filtered dataset.
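As a rough illustration of the filtering step mentioned in the abstract, the sketch below drops (triples, text) pairs that an error detector flags before they would be used for fine-tuning. It is not the authors' method: the abstract says filtering relied on automatic model judgments, whereas `naive_error_detector` here is a hypothetical string-matching stand-in, and `Example` and `filter_dataset` are likewise names assumed for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A WebNLG-style triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass
class Example:
    """One training pair: a set of triples and a candidate verbalization."""
    triples: List[Triple]
    text: str


def naive_error_detector(example: Example) -> bool:
    """Hypothetical stand-in for a model judgment (assumption, not the paper's
    detector): flag the example if any subject or object is absent from the text."""
    lowered = example.text.lower()
    for subj, _pred, obj in example.triples:
        for entity in (subj, obj):
            if entity.replace("_", " ").lower() not in lowered:
                return True
    return False


def filter_dataset(
    examples: List[Example],
    has_error: Callable[[Example], bool],
) -> List[Example]:
    """Keep only the examples that the error detector does not flag."""
    return [ex for ex in examples if not has_error(ex)]


examples = [
    Example([("Alan_Bean", "occupation", "Test_pilot")],
            "Alan Bean worked as a test pilot."),
    Example([("Alan_Bean", "occupation", "Test_pilot")],
            "Alan Bean was an astronaut."),
]
filtered = filter_dataset(examples, naive_error_detector)
```

In this toy run the second example is dropped because the object "Test pilot" never appears in its text. Swapping in a learned detector would only require passing a different `has_error` callable.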
Anthology ID:
2024.inlg-genchal.11
Volume:
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Simon Mille, Miruna-Adriana Clinciu
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
100–111
URL:
https://aclanthology.org/2024.inlg-genchal.11
Cite (ACL):
Alyssa Allen, Ashley Lewis, Yi-Chien Lin, Tomiris Kaumenova, and Michael White. 2024. OSU CompLing at the GEM’24 Data-to-Text Task. In Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges, pages 100–111, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
OSU CompLing at the GEM’24 Data-to-Text Task (Allen et al., INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-genchal.11.pdf