Distributional Thesauri for Information Retrieval and vice versa

Vincent Claveau, Ewa Kijak


Abstract
Distributional thesauri are useful in many Natural Language Processing tasks. In this paper, we address the problem of building and evaluating such thesauri with the help of Information Retrieval (IR) concepts. We propose two main contributions. First, following the work of [8], we show how IR tools and concepts can be used successfully to build a thesaurus. Through several experiments, and by directly evaluating the results against reference lexicons, we show that some IR models outperform state-of-the-art systems. Second, we use IR as an application framework to evaluate the generated thesaurus indirectly. Here again, this task-based evaluation validates the IR approach used to build the thesaurus. Moreover, it allows us to compare these results with those from the direct evaluation framework used in the literature. The observed differences call these evaluation practices into question.
Anthology ID:
L16-1588
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3709–3716
URL:
https://aclanthology.org/L16-1588
Cite (ACL):
Vincent Claveau and Ewa Kijak. 2016. Distributional Thesauri for Information Retrieval and vice versa. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3709–3716, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Distributional Thesauri for Information Retrieval and vice versa (Claveau & Kijak, LREC 2016)
PDF:
https://aclanthology.org/L16-1588.pdf