Olesia Nedopas


2024

Exploring the impact of data representation on neural data-to-text generation
David M. Howcroft | Lewis N. Watson | Olesia Nedopas | Dimitra Gkatzia
Proceedings of the 17th International Natural Language Generation Conference

A relatively under-explored area in research on neural natural language generation is the impact of the data representation on text quality. Here we report experiments on two leading input representations for data-to-text generation: attribute-value pairs and Resource Description Framework (RDF) triples. Evaluating the performance of encoder-decoder seq2seq models as well as recent large language models (LLMs) with both automated metrics and human evaluation, we find that the input representation does not seem to have a large impact on the performance of either purpose-built seq2seq models or LLMs. Finally, we present an error analysis of the texts generated by the LLMs and provide some insights into where these models fail.