Article selection using probabilistic sense disambiguation

Hian-Beng Lee


Abstract
A probabilistic method is used for word sense disambiguation, with the six surrounding words taken as features. Because only their surface forms are used, no syntactic or semantic analysis is required. Despite its simplicity, the method disambiguates the noun "interest" accurately. On the common data set of (Bruce & Wiebe 94), we obtained an average accuracy of 86.6%, compared with their reported figure of 78%. This portable technique can also be applied to the task of English article selection, a problem that arises in machine translation into English from any source language that has no articles. Using texts from the Wall Street Journal, we achieved an overall accuracy of 83.1% for the 1,500 most commonly used head nouns.
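The abstract does not say which probabilistic model is used, so the following is only a minimal sketch of the general idea, assuming a naive Bayes classifier with add-one smoothing over a bag of surface-form context words. The three-left, three-right split of the six-word window, the names `context_features` and `NaiveBayesSelector`, and the toy sentences are all illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def context_features(tokens, index, window=3):
    """Lowercased surface forms of up to `window` words on each side of tokens[index].

    Assumption: the six surrounding words are split three left, three right.
    """
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return [w.lower() for w in left + right]


class NaiveBayesSelector:
    """Hypothetical naive Bayes classifier over context words (not the paper's model)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # additive smoothing constant
        self.label_counts = Counter()           # label -> number of training examples
        self.word_counts = defaultdict(Counter) # label -> context word -> count
        self.vocab = set()

    def train(self, examples):
        for features, label in examples:
            self.label_counts[label] += 1
            for word in features:
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, features):
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocab) + 1        # one extra slot for unseen words
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)     # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * vocab_size
            for word in features:
                score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for illustration): sentence, index of "interest", sense label.
toy = [
    ("the bank raised its interest rate by a point", 4, "money"),
    ("he paid interest on the loan", 2, "money"),
    ("she has a keen interest in music", 4, "attention"),
    ("they showed great interest in the project", 3, "attention"),
]

model = NaiveBayesSelector()
model.train(
    (context_features(sentence.split(), index), label)
    for sentence, index, label in toy
)

query = "the interest rate on the loan rose".split()
predicted = model.predict(context_features(query, 1))
```

Under the same assumptions, article selection would reuse this structure with the label set swapped from senses to article choices (for example "a/an", "the", or no article) and the window centred on the position relevant to the head noun.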
Anthology ID:
1999.mtsummit-1.62
Volume:
Proceedings of Machine Translation Summit VII
Month:
September 13-17
Year:
1999
Address:
Singapore, Singapore
Venue:
MTSummit
Pages:
421–426
URL:
https://aclanthology.org/1999.mtsummit-1.62
Cite (ACL):
Hian-Beng Lee. 1999. Article selection using probabilistic sense disambiguation. In Proceedings of Machine Translation Summit VII, pages 421–426, Singapore, Singapore.
Cite (Informal):
Article selection using probabilistic sense disambiguation (Lee, MTSummit 1999)
PDF:
https://aclanthology.org/1999.mtsummit-1.62.pdf