More Than Spoken Words: Nonverbal Message Extraction and Generation

Dian Yu, Xiaoyang Wang, Wanshun Chen, Nan Du, Longyue Wang, Haitao Mi, Dong Yu


Abstract
Nonverbal messages (NM) such as speakers’ facial expressions and speed of speech are essential for face-to-face communication, and they can be regarded as implicit knowledge as they are usually not included in existing dialogue understanding or generation tasks. This paper introduces the task of extracting NMs in written text and generating NMs for spoken text. Previous studies merely focus on extracting NMs from relatively small-scale well-structured corpora such as movie scripts wherein NMs are enclosed in parentheses by scriptwriters, which greatly decreases the difficulty of extraction. To enable extracting NMs from unstructured corpora, we annotate the first NM extraction dataset for Chinese based on novels and develop three baselines to extract single-span or multi-span NM of a target utterance from its surrounding context. Furthermore, we use the extractors to extract 749K (context, utterance, NM) triples from Chinese novels and investigate whether we can use them to improve NM generation via semi-supervised learning. Experimental results demonstrate that the automatically extracted triples can serve as high-quality augmentation data of clean triples extracted from scripts to generate more relevant, fluent, valid, and factually consistent NMs than the purely supervised generator, and the resulting generator can in turn help Chinese dialogue understanding tasks such as dialogue machine reading comprehension and emotion classification by simply adding the predicted “unspoken” NM to each utterance or narrative in inputs.
Anthology ID:
2023.emnlp-main.1021
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
16396–16413
Language:
URL:
https://aclanthology.org/2023.emnlp-main.1021
DOI:
10.18653/v1/2023.emnlp-main.1021
Bibkey:
Cite (ACL):
Dian Yu, Xiaoyang Wang, Wanshun Chen, Nan Du, Longyue Wang, Haitao Mi, and Dong Yu. 2023. More Than Spoken Words: Nonverbal Message Extraction and Generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 16396–16413, Singapore. Association for Computational Linguistics.
Cite (Informal):
More Than Spoken Words: Nonverbal Message Extraction and Generation (Yu et al., EMNLP 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.emnlp-main.1021.pdf
Video:
 https://aclanthology.org/2023.emnlp-main.1021.mp4