TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields

Tim vor der Brück, Alexander Mehler


Abstract
We present a morphological tagger for Latin, called TTLab Latin Tagger based on Conditional Random Fields (TLT-CRF), which uses a large Latin lexicon. Beyond Part of Speech (PoS), TLT-CRF tags eight inflectional categories of verbs, adjectives or nouns. It utilizes a statistical model based on CRFs together with a rule interpreter that addresses scenarios of sparse training data. We present results of evaluating TLT-CRF to answer the question of what can be learnt following the paradigm of 1st order CRFs in conjunction with a large lexical resource and a rule interpreter. Furthermore, we investigate the contingency of representational features and targeted parts of speech to learn about selective features.
Anthology ID:
L16-1240
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1514–1519
URL:
https://aclanthology.org/L16-1240
Cite (ACL):
Tim vor der Brück and Alexander Mehler. 2016. TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1514–1519, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields (vor der Brück & Mehler, LREC 2016)
PDF:
https://aclanthology.org/L16-1240.pdf