Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems

Subhadarshi Panda, Caglar Tirkaz, Tobias Falke, Patrick Lehnen


Abstract
The lack of labeled training data for new features is a common problem in rapidly changing real-world dialog systems. As a solution, we propose a multilingual paraphrase generation model that can be used to generate novel utterances for a target feature and target language. The generated utterances can be used to augment existing training data to improve intent classification and slot labeling models. We evaluate the quality of generated utterances using intrinsic evaluation metrics and by conducting downstream evaluation experiments with English as the source language and nine different target languages. Our method shows promise across languages, even in a zero-shot setting where no seed data is available.
Anthology ID: 2021.nlp4convai-1.4
Volume: Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI
Month: November
Year: 2021
Address: Online
Venues: EMNLP | NLP4ConvAI
Publisher: Association for Computational Linguistics
Pages: 30–39
URL: https://aclanthology.org/2021.nlp4convai-1.4
DOI: 10.18653/v1/2021.nlp4convai-1.4
Cite (ACL): Subhadarshi Panda, Caglar Tirkaz, Tobias Falke, and Patrick Lehnen. 2021. Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems. In Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI, pages 30–39, Online. Association for Computational Linguistics.
Cite (Informal): Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems (Panda et al., NLP4ConvAI 2021)
PDF: https://aclanthology.org/2021.nlp4convai-1.4.pdf