Generating Racing Game Commentary from Vision, Language, and Structured Data

Tatsuya Ishigaki, Goran Topic, Yumi Hamazono, Hiroshi Noji, Ichiro Kobayashi, Yusuke Miyao, Hiroya Takamura


Abstract
We propose the task of automatically generating commentaries for races in a motor racing game from vision, structured numerical, and textual data. Commentaries help spectators understand the events in a race. Commentary generation models need to interpret the race situation and generate the correct content at the right moment. We divide the task into two subtasks: utterance timing identification and utterance generation. Because existing datasets do not provide such alignments of data across multiple modalities, this setting has not been explored in depth. In this study, we introduce a new large-scale dataset that contains aligned video data, structured numerical data, and transcribed commentaries comprising 129,226 utterances across 1,389 races in a game. Our analysis reveals that the characteristics of commentaries vary over time and across viewpoints. Our experiments on the subtasks show that it is still challenging for a state-of-the-art vision encoder to capture useful information from videos to generate accurate commentaries. We make the dataset and baseline implementation publicly available for further research.
Anthology ID:
2021.inlg-1.11
Volume:
Proceedings of the 14th International Conference on Natural Language Generation
Month:
August
Year:
2021
Address:
Aberdeen, Scotland, UK
Editors:
Anya Belz, Angela Fan, Ehud Reiter, Yaji Sripada
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
103–113
URL:
https://aclanthology.org/2021.inlg-1.11
DOI:
10.18653/v1/2021.inlg-1.11
Cite (ACL):
Tatsuya Ishigaki, Goran Topic, Yumi Hamazono, Hiroshi Noji, Ichiro Kobayashi, Yusuke Miyao, and Hiroya Takamura. 2021. Generating Racing Game Commentary from Vision, Language, and Structured Data. In Proceedings of the 14th International Conference on Natural Language Generation, pages 103–113, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Cite (Informal):
Generating Racing Game Commentary from Vision, Language, and Structured Data (Ishigaki et al., INLG 2021)
PDF:
https://aclanthology.org/2021.inlg-1.11.pdf