Local-Global Vectors to Improve Unigram Terminology Extraction
Ehsan Amjadian | Diana Inkpen | Tahereh Paribakht | Farahnaz Faez
Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)
This paper explores a novel method that integrates efficient distributed representations with terminology extraction. We show that information from a small number of observed instances can be combined with local and global word embeddings to remarkably improve term extraction results on unigram terms. To do so, we pass the term candidates extracted by other tools through a filter composed of the local-global embeddings and a classifier, which decides whether or not each candidate is a term. The filter can also serve as a hub that merges different term extraction tools into a single higher-performing system. For this task, we compare filters that use the skip-gram architecture with filters that use the CBOW (continuous bag-of-words) architecture.
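The filtering idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vectors stand in for embeddings that would be trained with skip-gram or CBOW on a domain corpus (local) and a general corpus (global), combining the two by concatenation is one plausible reading of "local-global", and the nearest-centroid rule is a simple stand-in for whatever classifier is actually used. The names `LOCAL`, `GLOBAL`, `local_global`, `train`, and `filter_candidates` are hypothetical.

```python
from math import sqrt

# Hypothetical toy embeddings. In practice these would be learned with
# skip-gram or CBOW on a domain corpus (LOCAL) and a general corpus (GLOBAL).
LOCAL = {
    "enzyme": [0.90, 0.10], "protein": [0.80, 0.20], "genome": [0.85, 0.15],
    "table": [0.10, 0.90], "result": [0.20, 0.80], "figure": [0.15, 0.85],
}
GLOBAL = {
    "enzyme": [0.70, 0.30], "protein": [0.75, 0.20], "genome": [0.80, 0.25],
    "table": [0.20, 0.70], "result": [0.30, 0.80], "figure": [0.25, 0.75],
}


def local_global(word):
    """Assumed combination: concatenate the local and global vectors."""
    return LOCAL[word] + GLOBAL[word]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(labeled):
    """Fit a nearest-centroid classifier (a stand-in, not the paper's model)
    from a small dict mapping word -> True (term) / False (non-term)."""
    terms = [local_global(w) for w, is_term in labeled.items() if is_term]
    others = [local_global(w) for w, is_term in labeled.items() if not is_term]
    return centroid(terms), centroid(others)


def filter_candidates(candidates, model):
    """Keep the candidates that lie closer to the term centroid."""
    term_centroid, other_centroid = model
    kept = []
    for word in candidates:
        if word not in LOCAL or word not in GLOBAL:
            continue  # no embedding available, so the filter cannot judge it
        vector = local_global(word)
        if distance(vector, term_centroid) < distance(vector, other_centroid):
            kept.append(word)
    return kept


# A small number of observed (labeled) instances.
seed = {"enzyme": True, "protein": True, "table": False, "result": False}
model = train(seed)

# Using the filter as a hub: merge the outputs of two hypothetical extractors.
tool_a_output = ["genome", "figure"]
tool_b_output = ["genome"]
candidates = sorted(set(tool_a_output) | set(tool_b_output))

kept = filter_candidates(candidates, model)
print(kept)
```

In this sketch the union of the two tools' outputs is filtered once, which is one simple way to read the "hub" role mentioned above; other merging strategies are equally compatible with the description.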