Binary Encoded Word Mover’s Distance

Christian Johnson


Abstract
Word Mover’s Distance is a textual distance metric which calculates the minimum transport cost between two sets of word embeddings. This metric achieves impressive results on semantic similarity tasks, but is slow and difficult to scale due to the large number of floating point calculations. This paper demonstrates that by combining pre-existing lower bounds with binary encoded word vectors, the metric can be rendered highly efficient in terms of computation time and memory while still maintaining accuracy on several textual similarity tasks.
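The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: words are represented by short binary codes packed into Python integers, the ground cost between two words is the Hamming distance between their codes, and the "pre-existing lower bound" is taken to be the relaxed Word Mover's Distance, in which each word sends all of its weight to its nearest counterpart in the other document. The function names and the toy 8-bit codes are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two bit vectors packed into Python ints."""
    # XOR leaves a 1 wherever the codes differ; count those bits.
    return bin(a ^ b).count("1")


def relaxed_wmd(doc_a, doc_b):
    """One-sided relaxation: every word in doc_a moves all of its weight
    to the closest word in doc_b.

    doc_a, doc_b: lists of (code, weight) pairs; weights in each list sum to 1.
    """
    return sum(
        weight * min(hamming(code, other) for other, _ in doc_b)
        for code, weight in doc_a
    )


def rwmd_lower_bound(doc_a, doc_b):
    """Symmetric relaxed bound: the larger of the two one-sided relaxations.

    Each one-sided value is a lower bound on the full transport cost,
    so their maximum is a lower bound as well.
    """
    return max(relaxed_wmd(doc_a, doc_b), relaxed_wmd(doc_b, doc_a))


# Toy 8-bit codes, made up for illustration (not real embeddings).
doc_a = [(0b10110010, 0.5), (0b01100111, 0.5)]
doc_b = [(0b10110000, 0.5), (0b11100111, 0.5)]

print(rwmd_lower_bound(doc_a, doc_b))
```

In this sketch the only per-pair work is an XOR and a bit count on integers, which is where the savings over floating point distance computations would come from under the stated assumptions.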
Anthology ID:
2022.repl4nlp-1.17
Volume:
Proceedings of the 7th Workshop on Representation Learning for NLP
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Spandana Gella, He He, Bodhisattwa Prasad Majumder, Burcu Can, Eleonora Giunchiglia, Samuel Cahyawijaya, Sewon Min, Maximilian Mozes, Xiang Lorraine Li, Isabelle Augenstein, Anna Rogers, Kyunghyun Cho, Edward Grefenstette, Laura Rimell, Chris Dyer
Venue:
RepL4NLP
Publisher:
Association for Computational Linguistics
Pages:
167–172
URL:
https://aclanthology.org/2022.repl4nlp-1.17
DOI:
10.18653/v1/2022.repl4nlp-1.17
Cite (ACL):
Christian Johnson. 2022. Binary Encoded Word Mover’s Distance. In Proceedings of the 7th Workshop on Representation Learning for NLP, pages 167–172, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Binary Encoded Word Mover’s Distance (Johnson, RepL4NLP 2022)
PDF:
https://aclanthology.org/2022.repl4nlp-1.17.pdf
Video:
https://aclanthology.org/2022.repl4nlp-1.17.mp4