Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

Samuel Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, Matthew Henderson


Abstract
We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as a turn-based span extraction task. This formulation allows for a simple integration of conversational knowledge coded in large pretrained conversational models such as ConveRT (Henderson et al., 2019). We show that leveraging such knowledge in Span-ConveRT is especially useful for few-shot learning scenarios: we report consistent gains over 1) a span extractor that trains representations from scratch in the target domain, and 2) a BERT-based span extractor. In order to inspire more work on span extraction for the slot-filling task, we also release RESTAURANTS-8K, a new challenging data set of 8,198 utterances, compiled from actual conversations in the restaurant booking domain.
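
To make the task framing concrete, the following is a minimal, hypothetical sketch in Python (not taken from the paper or its released code) of how a slot-filling annotation on a single user turn can be cast as span extraction, i.e. as predicting the start and end token indices of each slot value; the utterance, slot names, and helper function are illustrative assumptions.

# Hypothetical illustration only: casting slot-filling as turn-based span extraction.
# Each slot value is a contiguous token span in the user's turn, so supervision
# reduces to (start, end) token indices per slot.

def find_slot_span(tokens, value_tokens):
    """Return (start, end) indices of value_tokens inside tokens, or None if absent."""
    n, m = len(tokens), len(value_tokens)
    for i in range(n - m + 1):
        if tokens[i:i + m] == value_tokens:
            return (i, i + m - 1)
    return None

utterance = "book a table for 4 people at 7 pm tomorrow".split()
annotations = {"people": "4", "time": "7 pm", "date": "tomorrow"}  # example slot values

labels = {slot: find_slot_span(utterance, value.split())
          for slot, value in annotations.items()}
print(labels)  # {'people': (4, 4), 'time': (7, 8), 'date': (9, 9)}

A span extractor is trained to recover such indices directly from the turn; per the abstract, Span-ConveRT does this while reusing representations from the pretrained conversational encoder ConveRT rather than training them from scratch in the target domain.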
Anthology ID: 2020.acl-main.11
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2020
Address: Online
Editors: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 107–121
URL: https://aclanthology.org/2020.acl-main.11
DOI: 10.18653/v1/2020.acl-main.11
Cite (ACL): Samuel Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, and Matthew Henderson. 2020. Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 107–121, Online. Association for Computational Linguistics.
Cite (Informal): Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations (Coope et al., ACL 2020)
PDF: https://aclanthology.org/2020.acl-main.11.pdf
Video: http://slideslive.com/38929061
Code: PolyAI-LDN/task-specific-datasets