Syntax and Semantics in a Treebank for Esperanto

Eckhard Bick


Abstract
In this paper we describe and evaluate syntactic and semantic aspects of Arbobanko, a treebank for the artificial language Esperanto, as well as the tools and methods used to produce it. In addition to classical morphosyntax and dependency structure, the treebank was enriched with a lexical-semantic layer covering named entities, a semantic type ontology for nouns and adjectives, and a FrameNet-inspired semantic classification of verbs. For an under-resourced language, the quality of automatic syntactic and semantic pre-annotation is of obvious importance. By evaluating the underlying parser and the coverage of its semantic ontologies, we try to answer the question of whether the language’s extremely regular morphology and transparent semantic affixes translate into a more regular syntax and higher parsing accuracy. On the linguistic side, the treebank allows us to address and quantify typological issues such as word order, auxiliary constructions, lexical transparency and semantic type ambiguity in Esperanto.
Anthology ID: 2020.lrec-1.630
Volume: Proceedings of the 12th Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Venue: LREC
Publisher: European Language Resources Association
Pages: 5120–5127
Language: English
URL: https://aclanthology.org/2020.lrec-1.630
Cite (ACL): Eckhard Bick. 2020. Syntax and Semantics in a Treebank for Esperanto. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 5120–5127, Marseille, France. European Language Resources Association.
Cite (Informal): Syntax and Semantics in a Treebank for Esperanto (Bick, LREC 2020)
PDF: https://aclanthology.org/2020.lrec-1.630.pdf