MTL-SLT: Multi-Task Learning for Spoken Language Tasks

Zhiqi Huang, Milind Rao, Anirudh Raju, Zhe Zhang, Bach Bui, Chul Lee


Abstract
Language understanding in speech-based systems has attracted extensive interest from both academic and industrial communities in recent years with the growing demand for voice-based applications. Prior works focus on independent research by the automatic speech recognition (ASR) and natural language processing (NLP) communities, or on jointly modeling the speech and NLP problems focusing on a single dataset or single NLP task. To facilitate the development of spoken language research, we introduce MTL-SLT, a multi-task learning framework for spoken language tasks. MTL-SLT takes speech as input, and outputs transcription, intent, named entities, summaries, and answers to text queries, supporting the tasks of spoken language understanding, spoken summarization and spoken question answering respectively. The proposed framework benefits from three key aspects: 1) pre-trained sub-networks of ASR model and language model; 2) multi-task learning objective to exploit shared knowledge from different tasks; 3) end-to-end training of ASR and downstream NLP task based on sequence loss. We obtain state-of-the-art results on spoken language understanding tasks such as SLURP and ATIS. Spoken summarization results are reported on a new dataset: Spoken-Gigaword.
Anthology ID:
2022.nlp4convai-1.11
Volume:
Proceedings of the 4th Workshop on NLP for Conversational AI
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bing Liu, Alexandros Papangelis, Stefan Ultes, Abhinav Rastogi, Yun-Nung Chen, Georgios Spithourakis, Elnaz Nouri, Weiyan Shi
Venue:
NLP4ConvAI
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
120–130
Language:
URL:
https://aclanthology.org/2022.nlp4convai-1.11
DOI:
10.18653/v1/2022.nlp4convai-1.11
Bibkey:
Cite (ACL):
Zhiqi Huang, Milind Rao, Anirudh Raju, Zhe Zhang, Bach Bui, and Chul Lee. 2022. MTL-SLT: Multi-Task Learning for Spoken Language Tasks. In Proceedings of the 4th Workshop on NLP for Conversational AI, pages 120–130, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
MTL-SLT: Multi-Task Learning for Spoken Language Tasks (Huang et al., NLP4ConvAI 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.nlp4convai-1.11.pdf
Video:
 https://aclanthology.org/2022.nlp4convai-1.11.mp4
Data
ATISGLUELibriSpeechSLURPSpoken-SQuAD