Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources

Mohammad Taher Pilehvar, Nigel Collier


Abstract
We put forward an approach that exploits the knowledge encoded in lexical resources to induce representations for words that were not encountered frequently during training. Our approach has an advantage over past work in that it enables vocabulary expansion not only for morphological variations but also for infrequent domain-specific terms. We performed evaluations in different settings, showing that the technique can provide consistent improvements on multiple benchmarks across domains.
Anthology ID:
E17-2062
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
388–393
URL:
https://aclanthology.org/E17-2062
Cite (ACL):
Mohammad Taher Pilehvar and Nigel Collier. 2017. Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 388–393, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources (Pilehvar & Collier, EACL 2017)
PDF:
https://aclanthology.org/E17-2062.pdf