Brian Timoney
2024
Pipeline Neural Data-to-text with Large Language Models
Chinonso Cynthia Osuji | Brian Timoney | Thiago Castro Ferreira | Brian Davis
Proceedings of the 17th International Natural Language Generation Conference
Previous studies have highlighted the advantages of pipeline neural architectures over end-to-end models, particularly in reducing text hallucination. In this study, we extend prior research by integrating pretrained language models (PLMs) into a pipeline framework, using both fine-tuning and prompting methods. Our findings show that fine-tuned PLMs consistently generate high-quality text, especially within end-to-end architectures and at intermediate stages of the pipeline across various domains. These models also outperform prompt-based ones on automatic evaluation metrics but lag behind them in human evaluations. Compared to the standard five-stage pipeline architecture, a streamlined three-stage pipeline, which includes only ordering, structuring, and surface realization, achieves superior performance in fluency and semantic adequacy according to the human evaluation.
Imaginary Numbers! Evaluating Numerical Referring Expressions by Neural End-to-End Surface Realization Systems
Rossana Cunha | Osuji Chinonso | João Campos | Brian Timoney | Brian Davis | Fabio Cozman | Adriana Pagano | Thiago Castro Ferreira
Proceedings of the Fifth Workshop on Insights from Negative Results in NLP
Neural end-to-end surface realizers output more fluent texts than classical architectures. However, they tend to suffer from adequacy problems, in particular hallucinations in numerical referring expression generation. This poses a problem for language generation in sensitive domains, such as robot journalism covering COVID-19 and Amazon deforestation. We propose an approach whereby numerical referring expressions are converted from digits to plain word-form descriptions before being fed to state-of-the-art Large Language Models. We conduct automatic and human evaluations to identify the best strategy for numerical surface realization. Code and data are publicly available.
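The digit-to-word conversion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `int_to_words` and `verbalize_digits` are hypothetical, English number words are assumed purely for illustration (the abstract does not state the target language), and only plain non-negative integers are handled, so decimals, thousands separators, and dates would need extra rules.

```python
import re

# English number words; the choice of language is an assumption for illustration.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
SCALES = [(1_000_000_000, "billion"), (1_000_000, "million"), (1_000, "thousand")]


def int_to_words(n: int) -> str:
    """Spell out a non-negative integer in English words (hypothetical helper)."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + int_to_words(rest) if rest else "")
    # Largest scale first, so the leading group is named correctly.
    for value, name in SCALES:
        if n >= value:
            head, rest = divmod(n, value)
            tail = " " + int_to_words(rest) if rest else ""
            return int_to_words(head) + " " + name + tail


def verbalize_digits(text: str) -> str:
    """Replace each run of ASCII digits in the text with its word form."""
    return re.sub(r"[0-9]+", lambda m: int_to_words(int(m.group())), text)


if __name__ == "__main__":
    print(verbalize_digits("There were 305 new cases."))
```

Under these assumptions, a preprocessing step of this kind would be applied to the input before it reaches the language model, so that the model sees word-form numbers rather than digit strings.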
Co-authors
- Thiago Castro Ferreira 2
- Brian Davis 2
- João Campos 1
- Osuji Chinonso 1
- Fabio Cozman 1