Choosing the correct paradigm for unknown words in rule-based machine translation systems

V. M. Sánchez-Cartagena, M. Esplà-Gomis, F. Sánchez-Martínez, J. A. Pérez-Ortiz


Abstract
Previous work presented an interactive system that helps non-expert users enlarge the monolingual dictionaries of rule-based machine translation (MT) systems by discarding those inflection paradigms that cannot generate a set of inflected word forms validated by the user. This method, however, cannot deal with the common case in which several different paradigms generate exactly the same set of inflected word forms, although with different inflection information attached. In this paper, we propose the use of an n-gram-based model of lexical categories and inflection information to select a single paradigm in cases where more than one paradigm generates the same set of word forms. Results obtained with a Spanish monolingual dictionary show that the correct paradigm is chosen for around 75% of the unknown words. The resulting system, available under an open-source license, is therefore a valuable aid for non-expert users without technical linguistic knowledge who need to enlarge the monolingual dictionaries used in MT.
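The selection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the n-gram order, the smoothing, or the tag set, so the code below assumes a bigram model with add-one smoothing, and all tag names, paradigm names (`noun_m`, `noun_f`), and the toy training data are hypothetical. It shows two candidate paradigms that generate the same surface forms but attach different inflection information, and picks the one whose tags are most probable in the contexts where the unknown word occurs.

```python
import math
from collections import Counter


def train_bigram_model(tag_sequences):
    """Count tag unigrams and bigrams from tagged sentences (lists of tags)."""
    unigrams, bigrams = Counter(), Counter()
    for tags in tag_sequences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        unigrams.update(padded[:-1])          # counts of history symbols
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_prob(tags, unigrams, bigrams, vocab_size):
    """Add-one smoothed bigram log-probability of a tag sequence."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def choose_paradigm(candidates, contexts, unigrams, bigrams, vocab_size):
    """Return the name of the candidate paradigm with the best total score.

    candidates: paradigm name -> {surface form: tag assigned by the paradigm}
    contexts:   list of (left_tags, surface_form, right_tags) occurrences
    """
    def score(form_to_tag):
        return sum(
            log_prob(left + [form_to_tag[form]] + right,
                     unigrams, bigrams, vocab_size)
            for left, form, right in contexts
        )
    return max(candidates, key=lambda name: score(candidates[name]))


# Toy tagged corpus (hypothetical tags, for illustration only).
training = [
    ["det.f.sg", "n.f.sg", "vb.3sg"],
    ["det.f.sg", "n.f.sg", "adj.f.sg"],
    ["det.m.sg", "n.m.sg", "vb.3sg"],
    ["det.m.pl", "n.m.pl", "vb.3pl"],
]
unigrams, bigrams = train_bigram_model(training)
# Possible next symbols: every tag seen in training plus the end marker.
vocab_size = len({t for seq in training for t in seq}) + 1

# Both paradigms generate exactly the forms {"mapa", "mapas"};
# they differ only in the gender information they attach.
candidates = {
    "noun_m": {"mapa": "n.m.sg", "mapas": "n.m.pl"},
    "noun_f": {"mapa": "n.f.sg", "mapas": "n.f.pl"},
}
# Occurrences of the unknown word with the tags of its neighbours.
contexts = [
    (["det.m.sg"], "mapa", ["vb.3sg"]),
    (["det.m.pl"], "mapas", ["vb.3pl"]),
]

best = choose_paradigm(candidates, contexts, unigrams, bigrams, vocab_size)
print(best)  # noun_m
```

In this toy setting the masculine paradigm wins because its tags follow masculine determiners in the training data, while the feminine tags never do. A real system would need a tagged corpus of realistic size and a more careful smoothing scheme.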
Anthology ID:
2012.freeopmt-1.4
Volume:
Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation
Month:
June 13-15
Year:
2012
Address:
Gothenburg, Sweden
Editors:
Cristina España-Bonet, Aarne Ranta
Venue:
FreeOpMT
Pages:
27–40
URL:
https://aclanthology.org/2012.freeopmt-1.4
Cite (ACL):
V. M. Sánchez-Cartagena, M. Esplà-Gomis, F. Sánchez-Martínez, and J. A. Pérez-Ortiz. 2012. Choosing the correct paradigm for unknown words in rule-based machine translation systems. In Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation, pages 27–40, Gothenburg, Sweden.
Cite (Informal):
Choosing the correct paradigm for unknown words in rule-based machine translation systems (Sánchez-Cartagena et al., FreeOpMT 2012)
PDF:
https://aclanthology.org/2012.freeopmt-1.4.pdf