Towards a Seamless Integration of Word Senses into Downstream NLP Applications

Mohammad Taher Pilehvar, Jose Camacho-Collados, Roberto Navigli, Nigel Collier


Abstract
Lexical ambiguity can prevent NLP systems from accurately understanding semantics. Despite its potential benefits, the integration of sense-level information into NLP systems has remained understudied. By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. We show that a simple disambiguation of the input text can lead to consistent performance improvement on multiple topic categorization and polarity detection datasets, particularly when the fine granularity of the underlying sense inventory is reduced and the document is sufficiently large. Our results also point to the need for sense representation research to focus more on in vivo evaluations which target the performance in downstream NLP applications rather than artificial benchmarks.
Anthology ID:
P17-1170
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Editors:
Regina Barzilay, Min-Yen Kan
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1857–1869
URL:
https://aclanthology.org/P17-1170/
DOI:
10.18653/v1/P17-1170
Cite (ACL):
Mohammad Taher Pilehvar, Jose Camacho-Collados, Roberto Navigli, and Nigel Collier. 2017. Towards a Seamless Integration of Word Senses into Downstream NLP Applications. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1857–1869, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Towards a Seamless Integration of Word Senses into Downstream NLP Applications (Pilehvar et al., ACL 2017)
PDF:
https://aclanthology.org/P17-1170.pdf
Code:
pilehvar/sensecnn
Data:
IMDb Movie Reviews
Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison