A Comparative Evaluation of Word Sense Disambiguation Algorithms for German

Verena Henrich, Erhard Hinrichs


Abstract
The present paper explores a wide range of word sense disambiguation (WSD) algorithms for German. These WSD algorithms are based on a suite of semantic relatedness measures, including path-based, information-content-based, and gloss-based methods. Since the individual algorithms produce diverse results in terms of precision and thus complement each other well in terms of coverage, a set of combined algorithms is investigated and compared in performance to the individual algorithms. Among the single algorithms considered, a word overlap method derived from the Lesk algorithm that uses Wiktionary glosses and GermaNet lexical fields yields the best F-score of 56.36. This result is outperformed by a combined WSD algorithm that uses weighted majority voting and obtains an F-score of 63.59. The WSD experiments utilize the German wordnet GermaNet as a sense inventory as well as WebCAGe (short for: Web-Harvested Corpus Annotated with GermaNet Senses), a newly constructed, sense-annotated corpus for this language. The WSD experiments also confirm that WSD performance is lower for words with fine-grained sense distinctions compared to words with coarse-grained senses.
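As an illustration of two ideas named in the abstract, the following is a minimal sketch of a Lesk-style word overlap scorer and a weighted majority vote over several WSD algorithms. It is not the authors' implementation: the function names, the toy glosses for the German noun "Bank", the sense labels, and the weights are all hypothetical and chosen only for illustration. The abstract does not state how the paper derives its weights or which gloss tokens it uses.

```python
from collections import defaultdict


def overlap_score(context_tokens, gloss_tokens):
    """Count the distinct word types shared by the context and a sense gloss."""
    return len(set(context_tokens) & set(gloss_tokens))


def lesk_choose(context_tokens, sense_glosses):
    """Pick the sense whose gloss overlaps most with the context.

    Returns None when no gloss shares a word with the context, which
    models an algorithm that abstains (and so has incomplete coverage).
    """
    best_sense, best_score = None, 0
    for sense, gloss_tokens in sense_glosses.items():
        score = overlap_score(context_tokens, gloss_tokens)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


def weighted_majority_vote(predictions, weights):
    """Combine per-algorithm predictions by summing the weights per sense.

    predictions: dict mapping algorithm name -> predicted sense or None.
    weights: dict mapping algorithm name -> non-negative weight (assumed given).
    Returns the sense with the highest total weight, or None if all abstain.
    """
    totals = defaultdict(float)
    for algorithm, sense in predictions.items():
        if sense is not None:
            totals[sense] += weights.get(algorithm, 0.0)
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical toy data: two senses of "Bank" with made-up gloss tokens.
sense_glosses = {
    "bank_institut": ["geldinstitut", "konto", "geld", "kredit"],
    "bank_sitz": ["sitzmöbel", "park", "sitzen"],
}
context = ["sie", "eröffnet", "ein", "konto", "für", "ihr", "geld"]

lesk_sense = lesk_choose(context, sense_glosses)

# Hypothetical predictions from three algorithms and made-up weights.
predictions = {"lesk": lesk_sense, "path": "bank_sitz", "ic": "bank_sitz"}
weights = {"lesk": 0.6, "path": 0.3, "ic": 0.2}
combined = weighted_majority_vote(predictions, weights)
```

In this toy setup the overlap scorer selects `bank_institut`, and the weighted vote keeps that choice even though two of the three algorithms disagree, because the single higher-weighted vote outweighs their combined weight. An algorithm that returns `None` simply contributes nothing, which is one way a combination can raise coverage over any individual algorithm.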
Anthology ID:
L12-1031
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
576–583
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/164_Paper.pdf
Cite (ACL):
Verena Henrich and Erhard Hinrichs. 2012. A Comparative Evaluation of Word Sense Disambiguation Algorithms for German. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 576–583, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
A Comparative Evaluation of Word Sense Disambiguation Algorithms for German (Henrich & Hinrichs, LREC 2012)