Converting a Database of Complex German Word Formation for Linked Data

Petra Steiner


Abstract
This work combines two lexical resources with morphological information on German word formation, CELEX for German and the latest release of GermaNet, for extracting and building complex word structures. This yields a database of over 100,000 German wordtrees. A definition for sequential morphological analyses leads to an Ontolex-Lemon type model. By using GermaNet sense information, the data can be linked to other semantic resources. An alignment to the CIDOC Conceptual Reference Model (CIDOC-CRM) is also provided. The scripts for the data generation are publicly available on GitHub.
Anthology ID:
2022.gwll-1.8
Volume:
Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Ilan Kernerman, Simon Krek
Venue:
gwll
Publisher:
European Language Resources Association
Pages:
52–59
URL:
https://aclanthology.org/2022.gwll-1.8
Cite (ACL):
Petra Steiner. 2022. Converting a Database of Complex German Word Formation for Linked Data. In Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference, pages 52–59, Marseille, France. European Language Resources Association.
Cite (Informal):
Converting a Database of Complex German Word Formation for Linked Data (Steiner, gwll 2022)
PDF:
https://aclanthology.org/2022.gwll-1.8.pdf
Data
CELEX