Enhancing Out-of-Vocabulary Estimation with Subword Attention

Raj Patel, Carlotta Domeniconi


Abstract
Word embedding methods like word2vec and GloVe have been shown to learn strong representations of words. However, these methods only learn representations for words in the training corpus and therefore struggle to handle unknown and new words, known as out-of-vocabulary (OOV) words. As a result, there have been multiple attempts to learn OOV word representations in a similar fashion to how humans learn new words, using word roots/subwords and/or surrounding words. However, while most of these approaches apply advanced architectures like attention to the context of the OOV word, they tend to use simple structures like n-gram addition or character-based convolutional neural networks (CNNs) to process subword information. In response, we propose SubAtt, a transformer-based OOV estimation model that uses attention mechanisms on both the context and the subwords. In addition to attention, we show that pretraining subword representations leads to further improvement in OOV estimation. We show that SubAtt outperforms current state-of-the-art OOV estimation models.
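To make the contrast in the abstract concrete, the following is a minimal sketch of the two ways of combining subword vectors that it mentions: plain n-gram addition and attention-weighted pooling. It is not the SubAtt architecture. The function names, the embedding size `DIM`, the pseudo-random stand-in embeddings, and the single query vector are all assumptions made for illustration; a real model would learn these parameters and, per the abstract, use a transformer over both subwords and context.

```python
import math
import random

DIM = 8  # illustrative embedding size (assumption, not from the paper)


def char_ngrams(word, n_min=3, n_max=4):
    """Character n-grams of the boundary-marked word."""
    marked = f"<{word}>"
    return [marked[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(marked) - n + 1)]


def ngram_vector(ngram):
    """Stand-in for a learned subword embedding: a fixed pseudo-random vector."""
    rng = random.Random(ngram)  # seeded by the string, so repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def additive_pool(vectors):
    """n-gram addition: every subword contributes equally."""
    return [sum(column) for column in zip(*vectors)]


def attention_pool(vectors, query):
    """Scaled dot-product attention with one query: subwords get softmax weights."""
    scale = math.sqrt(len(query))
    scores = [sum(q * v for q, v in zip(query, vec)) / scale for vec in vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors))
              for d in range(len(query))]
    return pooled, weights


ngrams = char_ngrams("unhappiness")
vectors = [ngram_vector(g) for g in ngrams]
query = ngram_vector("<query>")  # hypothetical query; learned in a real model

summed = additive_pool(vectors)
attended, weights = attention_pool(vectors, query)
```

With addition, every n-gram has the same influence on the estimate. With attention, the `weights` list shows how much each n-gram contributes, which is the kind of selectivity the abstract argues subword processing has been missing.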
Anthology ID:
2023.findings-acl.221
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3592–3601
URL:
https://aclanthology.org/2023.findings-acl.221
DOI:
10.18653/v1/2023.findings-acl.221
Cite (ACL):
Raj Patel and Carlotta Domeniconi. 2023. Enhancing Out-of-Vocabulary Estimation with Subword Attention. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3592–3601, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Enhancing Out-of-Vocabulary Estimation with Subword Attention (Patel & Domeniconi, Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.221.pdf