AnIta: a powerful morphological analyser for Italian

Fabio Tamburini, Matias Melandri


Abstract
In this paper we present AnIta, a powerful morphological analyser for Italian implemented within the framework of finite-state-automata models. It is equipped with a large lexicon containing more than 110,000 lemmas, which enables it to cover relevant portions of Italian texts. We describe our design choices for the management of inflectional phenomena as well as some interesting new features to explicitly handle derivational and compositional processes in Italian, namely the wordform segmentation structure and the Derivation Graph. Two different evaluation experiments, testing coverage (Recall) and Precision, are described in detail, comparing the performance of AnIta with that of other freely available tools for handling Italian morphology. The experimental results show that the AnIta Morphological Analyser obtains the best performance among the tested systems, with Recall = 97.21% and Precision = 98.71%. This tool was a fundamental building block for designing a high-performance PoS-tagger and Lemmatiser for the Italian language that participated in two EVALITA evaluation campaigns, ranking in both cases among the best-performing systems.
Anthology ID:
L12-1069
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
941–947
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/213_Paper.pdf
Cite (ACL):
Fabio Tamburini and Matias Melandri. 2012. AnIta: a powerful morphological analyser for Italian. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 941–947, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
AnIta: a powerful morphological analyser for Italian (Tamburini & Melandri, LREC 2012)