Recognition of Polish Derivational Relations Based on Supervised Learning Scheme

Maciej Piasecki, Radoslaw Ramocki, Marek Maziarz


Abstract
The paper presents the construction of Derywator, a language tool for the recognition of Polish derivational relations. It was built using machine learning, following a bootstrapping approach: a limited set of derivational pairs described manually by linguists in plWordNet is used to train Derywator. The tool is intended for the semi-automated expansion of plWordNet with new instances of derivational relations. The training process is based on the construction of two transducers working in opposite directions: one for prefixes and one for suffixes. Internal stem alternations are recognised, recorded in the form of mapping sequences, and stored together with the transducers. The raw results produced by Derywator then undergo corpus-based and morphological filtering. The set of derivational relations defined in plWordNet is presented, and test results for different derivational relations are discussed. The problem of the necessary corpus-based semantic filtering is analysed. The tool depends only to a very small extent on hand-crafted knowledge for a particular language: only the table of possible alternations and the morphological filtering rules must be replaced, which should take no longer than a couple of working days.
Anthology ID:
L12-1555
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
916–922
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/926_Paper.pdf
Cite (ACL):
Maciej Piasecki, Radoslaw Ramocki, and Marek Maziarz. 2012. Recognition of Polish Derivational Relations Based on Supervised Learning Scheme. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 916–922, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Recognition of Polish Derivational Relations Based on Supervised Learning Scheme (Piasecki et al., LREC 2012)