Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German

Maria Sukhareva, Christian Chiarcos


Abstract
In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties, using the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time. To the best of our knowledge, this is the first experiment in automatically producing morphosyntactic annotations for Middle Low German, and accordingly, no part-of-speech (POS) tagset is currently agreed upon. In our experiment, we illustrate how ontology-based specifications of projected annotations can be employed to circumvent this issue: instead of training and evaluating against a given tagset, we decompose it into individual features, each of which is predicted independently by a neural network. Using consistency constraints (axioms) from an ontology, the predicted feature probabilities are then decoded into a sound ontological representation. Using these representations, we can finally bootstrap a POS tagset capturing only those morphosyntactic features that could be reliably predicted. In this way, our approach is capable of optimizing precision and recall of morphosyntactic annotations simultaneously with bootstrapping a tagset, rather than performing iterative cycles.
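The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the two axiom types (disjointness and subsumption), the probabilities, and the exhaustive search are all assumptions made for illustration. Given per-feature probabilities, as a network might predict them independently, it picks the most probable feature set that violates no axiom.

```python
from itertools import product


def decode(probs, disjoint, implies):
    """Return the most probable axiom-consistent set of features.

    probs:    feature name -> independently predicted probability
    disjoint: pairs (a, b) that may not both hold (hypothetical axioms)
    implies:  pairs (a, b) where a holding requires b (hypothetical axioms)
    """
    features = sorted(probs)
    best, best_score = None, -1.0
    # Exhaustive search over all assignments; only feasible for few features.
    for bits in product([False, True], repeat=len(features)):
        assignment = dict(zip(features, bits))
        # Disjointness axiom: two features may not both hold.
        if any(assignment[a] and assignment[b] for a, b in disjoint):
            continue
        # Subsumption axiom: if a holds, b must hold as well.
        if any(assignment[a] and not assignment[b] for a, b in implies):
            continue
        # Joint score under the assumption that features are independent.
        score = 1.0
        for f in features:
            score *= probs[f] if assignment[f] else 1.0 - probs[f]
        if score > best_score:
            best, best_score = assignment, score
    return {f for f in features if best[f]}


# Hypothetical feature probabilities and axioms, for illustration only.
probs = {"Noun": 0.6, "Verb": 0.55, "ProperNoun": 0.7}
disjoint = [("Noun", "Verb")]
implies = [("ProperNoun", "Noun")]
result = decode(probs, disjoint, implies)
print(sorted(result))  # ['Noun', 'ProperNoun']
```

Thresholding each feature at 0.5 on its own would select all three features here, which contradicts the disjointness of Noun and Verb; the constrained decoding returns a consistent set instead.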
Anthology ID:
L16-1234
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1471–1480
URL:
https://aclanthology.org/L16-1234
Cite (ACL):
Maria Sukhareva and Christian Chiarcos. 2016. Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1471–1480, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German (Sukhareva & Chiarcos, LREC 2016)
PDF:
https://aclanthology.org/L16-1234.pdf
Data
MULTEXT-East