2022
The Annohub Web Portal
Frank Abromeit
Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference
We introduce the Annohub web portal, which specializes in metadata for annotated language resources such as corpora, lexica and linguistic terminologies. The new portal provides easy access to our previously released Annohub Linked Data set by allowing users to explore the annotation metadata in a web browser. In addition, we added features that allow users to contribute to Annohub by uploading language data in RDF, CoNLL or XML formats for annotation scheme and language analysis. The generated metadata is then available for personal use or for release in Annohub.
2020
Annohub – Annotation Metadata for Linked Data Applications
Frank Abromeit | Christian Fäth | Luis Glaser
Proceedings of the 7th Workshop on Linked Data in Linguistics (LDL-2020)
We introduce a new dataset for the Linguistic Linked Open Data (LLOD) cloud that will provide metadata about annotation and language information harvested from annotated language resources, such as corpora, that are freely available on the internet. To our knowledge, annotation metadata is not yet provided by any metadata provider, e.g. linghub, datahub or CLARIN. In addition, the language metadata found on such portals is rarely provided in machine-readable form, especially as Linked Data. In this paper, we describe the harvesting process, content and structure of the new dataset and its application in the Lin|gu|is|tik portal, a research platform for linguists. We also introduce tools for converting XML-encoded language resources to the CoNLL format. The generated RDF data and the XML converter application are made public under an open license.
Annotation Interoperability for the Post-ISOCat Era
Christian Chiarcos | Christian Fäth | Frank Abromeit
Proceedings of the Twelfth Language Resources and Evaluation Conference
With this paper, we provide an overview of ISOCat successor solutions and annotation standardization efforts since 2010, and we describe the low-cost harmonization of post-ISOCat vocabularies by means of modular, linked ontologies: the CLARIN Concept Registry, LexInfo, Universal Parts of Speech, Universal Dependencies and UniMorph are linked with the Ontologies of Linguistic Annotation and, through it, with ISOCat, the GOLD ontology, the Typological Database Systems ontology and a large number of annotation schemes.
2018
Universal Morphologies for the Caucasus region
Christian Chiarcos | Kathrin Donandt | Maxim Ionov | Monika Rind-Pawlowski | Hasmik Sargsian | Jesse Wichers Schreur | Frank Abromeit | Christian Fäth
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Interoperability of Language-related Information: Mapping the BLL Thesaurus to Lexvo and Glottolog
Vanya Dimitrova | Christian Fäth | Christian Chiarcos | Heike Renner-Westermann | Frank Abromeit
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
2016
Lin|gu|is|tik: Building the Linguist’s Pathway to Bibliographies, Libraries, Language Resources and Linked Open Data
Christian Chiarcos | Christian Fäth | Heike Renner-Westermann | Frank Abromeit | Vanya Dimitrova
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper introduces a novel research tool for the field of linguistics: the Lin|gu|is|tik web portal provides a virtual library that offers scientific information on every linguistic subject. It comprises selected internet sources and databases as well as catalogues for linguistic literature, and addresses an interdisciplinary audience. The virtual library is the most recent outcome of the Special Subject Collection Linguistics of the German Research Foundation (DFG), and it also integrates the knowledge accumulated in the Bibliography of Linguistic Literature. In addition to the portal, we describe long-term goals and prospects, with a special focus on ongoing efforts to extend it by integrating language resources and Linguistic Linked Open Data.