SLURP: A Spoken Language Understanding Resource Package

Emanuele Bastianelli, Andrea Vanzo, Pawel Swietojanski, Verena Rieser


Abstract
Spoken Language Understanding (SLU) infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications. However, publicly available SLU resources are limited. In this paper, we release SLURP, a new SLU package containing the following: (1) a new, challenging dataset in English spanning 18 domains, which is substantially larger and linguistically more diverse than existing datasets; (2) competitive baselines based on state-of-the-art NLU and ASR systems; (3) a new transparent metric for entity labelling, which enables detailed error analysis to identify potential areas of improvement. SLURP is available at https://github.com/pswietojanski/slurp.
Anthology ID:
2020.emnlp-main.588
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7252–7262
URL:
https://aclanthology.org/2020.emnlp-main.588
DOI:
10.18653/v1/2020.emnlp-main.588
Cite (ACL):
Emanuele Bastianelli, Andrea Vanzo, Pawel Swietojanski, and Verena Rieser. 2020. SLURP: A Spoken Language Understanding Resource Package. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7252–7262, Online. Association for Computational Linguistics.
Cite (Informal):
SLURP: A Spoken Language Understanding Resource Package (Bastianelli et al., EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.588.pdf
Video:
https://slideslive.com/38939295
Code:
pswietojanski/slurp
Data:
SLURP