A New Integrated Open-source Morphological Analyzer for Hungarian

Attila Novák, Borbála Siklósi, Csaba Oravecz


Abstract
A Hungarian research project set out to create an integrated Hungarian natural language processing framework. This infrastructure includes tools for analyzing Hungarian texts, integrated into a standardized environment. The morphological analyzer is one of the core components of the framework. This paper describes a fast and customizable morphological analyzer and its development framework, which synthesizes and further enriches the morphological knowledge implemented in earlier tools for Hungarian. In addition, we present the method we applied to add semantic knowledge to the lexical database of the morphology. The method combines neural word embedding models with morphological and shallow syntactic knowledge.
Anthology ID:
L16-1209
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1315–1322
URL:
https://aclanthology.org/L16-1209
Cite (ACL):
Attila Novák, Borbála Siklósi, and Csaba Oravecz. 2016. A New Integrated Open-source Morphological Analyzer for Hungarian. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1315–1322, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
A New Integrated Open-source Morphological Analyzer for Hungarian (Novák et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1209.pdf