Uncertainty and Traffic-Aware Active Learning for Semantic Parsing

Priyanka Sen, Emine Yilmaz


Abstract
Collecting training data for semantic parsing is a time-consuming and expensive task. As a result, there is growing interest in industry in reducing the number of annotations required to train a semantic parser, both to cut costs and to limit the customer data handled by annotators. In this paper, we propose uncertainty and traffic-aware active learning, a novel active learning method that uses model confidence and utterance frequencies from customer traffic to select utterances for annotation. We show that our method significantly outperforms baselines on an internal customer dataset and the Facebook Task Oriented Parsing (TOP) dataset. On our internal dataset, our method achieves the same accuracy as random sampling with 2,000 fewer annotations.
Anthology ID:
2020.intexsempar-1.2
Volume:
Proceedings of the First Workshop on Interactive and Executable Semantic Parsing
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | intexsempar
Publisher:
Association for Computational Linguistics
Pages:
12–17
URL:
https://aclanthology.org/2020.intexsempar-1.2
DOI:
10.18653/v1/2020.intexsempar-1.2
Cite (ACL):
Priyanka Sen and Emine Yilmaz. 2020. Uncertainty and Traffic-Aware Active Learning for Semantic Parsing. In Proceedings of the First Workshop on Interactive and Executable Semantic Parsing, pages 12–17, Online. Association for Computational Linguistics.
Cite (Informal):
Uncertainty and Traffic-Aware Active Learning for Semantic Parsing (Sen & Yilmaz, intexsempar 2020)
PDF:
https://aclanthology.org/2020.intexsempar-1.2.pdf
Video:
https://slideslive.com/38939454