How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation?

Xunjian Yin, Xiaojun Wan


Abstract
With the rapid development of deep learning, the Seq2Seq paradigm has become prevalent for end-to-end data-to-text generation, and BLEU scores have been increasing in recent years. However, it is widely recognized that there is still a gap between the quality of texts generated by models and texts written by humans. To better understand the abilities of Seq2Seq models, evaluate their performance, and analyze the results, we use the Multidimensional Quality Metrics (MQM) framework to evaluate several representative Seq2Seq models on end-to-end data-to-text generation. We annotate the outputs of five models on four datasets with eight error types and find that 1) the copy mechanism helps reduce Omission and Inaccuracy Extrinsic errors but increases other types of errors such as Addition; 2) pre-training techniques are highly effective, and both the pre-training strategy and the model size matter greatly; 3) the structure of the dataset also strongly influences model performance; 4) some specific types of errors remain generally challenging for Seq2Seq models.
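The abstract describes tallying MQM-style annotations across models, datasets, and error types. Below is a minimal sketch (not the authors' released code) of how such error counts could be aggregated; the record format, field names, and example values are illustrative assumptions.

```python
# Minimal sketch of aggregating MQM-style error annotations
# per (model, error type). Names and record layout are assumptions,
# not the paper's actual annotation schema.
from collections import Counter
from dataclasses import dataclass


@dataclass
class Annotation:
    model: str       # e.g. "pointer-generator", "T5-base" (hypothetical labels)
    dataset: str     # e.g. "ToTTo", "WikiBio"
    error_type: str  # one of the eight MQM error categories used in the study


def error_profile(annotations):
    """Count annotated errors for each (model, error_type) pair."""
    return Counter((a.model, a.error_type) for a in annotations)


if __name__ == "__main__":
    demo = [
        Annotation("pointer-generator", "WikiBio", "Addition"),
        Annotation("pointer-generator", "WikiBio", "Omission"),
        Annotation("T5-base", "ToTTo", "Inaccuracy Extrinsic"),
    ]
    for (model, err), n in error_profile(demo).items():
        print(f"{model:20s} {err:22s} {n}")
```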
Anthology ID:
2022.acl-long.531
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7701–7710
URL:
https://aclanthology.org/2022.acl-long.531
DOI:
10.18653/v1/2022.acl-long.531
Cite (ACL):
Xunjian Yin and Xiaojun Wan. 2022. How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation?. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7701–7710, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation? (Yin & Wan, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.531.pdf
Code
xunjianyin/seq2seqondata2text
Data
ToTTo
WikiBio