New Domain, Major Effort? How Much Data is Necessary to Adapt a Temporal Tagger to the Voice Assistant Domain

Touhidul Alam, Alessandra Zarcone, Sebastian Padó


Abstract
Reliable tagging of Temporal Expressions (TEs, e.g., "Sunday evening" in "Book a table at L’Osteria for Sunday evening") is a central requirement for Voice Assistants (VAs). However, there is a dearth of resources and systems for the VA domain, since publicly available temporal taggers are trained only on substantially different domains, such as news and clinical text. Since the cost of annotating large datasets is prohibitive, we investigate the trade-off between in-domain data and performance in DA-Time, a hybrid temporal tagger for the English VA domain which combines a neural architecture for robust TE recognition with a parser-based TE normalizer. We find that transfer learning goes a long way even with as little as 25 in-domain sentences: DA-Time performs at the state of the art on the news domain and substantially outperforms the state of the art on the VA domain.
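As a rough illustration of the two-stage design described in the abstract (TE recognition followed by TE normalization), the following Python sketch tags the example utterance. It is not DA-Time's implementation: the regular-expression recognizer merely stands in for the neural recognizer, the rule-based normalizer stands in for the parser-based one, and the function names (`recognize`, `normalize`, `tag`), the TIMEX3-style output values, and the rule that a weekday resolves to its next occurrence on or after the reference date are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical toy vocabulary; a real tagger covers far more expression types.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
# TIMEX3-style part-of-day codes (assumed output format, not stated in the source).
PARTS_OF_DAY = {"morning": "TMO", "afternoon": "TAF",
                "evening": "TEV", "night": "TNI"}

TE_PATTERN = re.compile(
    r"\b(?P<weekday>" + "|".join(WEEKDAYS) + r")"
    r"(?:\s+(?P<part>" + "|".join(PARTS_OF_DAY) + r"))?\b",
    re.IGNORECASE,
)


def recognize(utterance):
    """Stage 1 (stand-in for a neural recognizer): find TE spans."""
    return [(m.start(), m.end(), m.group(0))
            for m in TE_PATTERN.finditer(utterance)]


def normalize(te_text, reference):
    """Stage 2 (stand-in for a parser-based normalizer): map a TE to a value."""
    m = TE_PATTERN.fullmatch(te_text)
    if m is None:
        return None
    target = WEEKDAYS.index(m.group("weekday").lower())
    # Assumption: resolve to the next such weekday on or after the reference date.
    days_ahead = (target - reference.weekday()) % 7
    value = (reference + timedelta(days=days_ahead)).isoformat()
    if m.group("part"):
        value += PARTS_OF_DAY[m.group("part").lower()]
    return value


def tag(utterance, reference):
    """Run recognition, then normalization, on one utterance."""
    return [(text, normalize(text, reference))
            for _, _, text in recognize(utterance)]


if __name__ == "__main__":
    utterance = "Book a table at L’Osteria for Sunday evening"
    print(tag(utterance, date(2021, 6, 16)))
```

Keeping the two stages as separate functions mirrors the hybrid split named in the abstract: the recognizer only decides where a TE is, and the normalizer only decides what it means relative to a reference date.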
Anthology ID:
2021.iwcs-1.14
Volume:
Proceedings of the 14th International Conference on Computational Semantics (IWCS)
Month:
June
Year:
2021
Address:
Groningen, The Netherlands (online)
Venue:
IWCS
SIG:
SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
144–154
URL:
https://aclanthology.org/2021.iwcs-1.14
Cite (ACL):
Touhidul Alam, Alessandra Zarcone, and Sebastian Padó. 2021. New Domain, Major Effort? How Much Data is Necessary to Adapt a Temporal Tagger to the Voice Assistant Domain. In Proceedings of the 14th International Conference on Computational Semantics (IWCS), pages 144–154, Groningen, The Netherlands (online). Association for Computational Linguistics.
Cite (Informal):
New Domain, Major Effort? How Much Data is Necessary to Adapt a Temporal Tagger to the Voice Assistant Domain (Alam et al., IWCS 2021)
PDF:
https://aclanthology.org/2021.iwcs-1.14.pdf
Code:
audiolabs/da-time
Data:
SNIPS