Spicy Salmon: Converting between 50+ Annotation Formats with Fintan, Pepper, Salt and Powla

Christian Fäth, Christian Chiarcos


Abstract
Heterogeneity of formats, models and annotations has always been a primary hindrance to exploiting the ever-increasing amount of existing linguistic resources for real-world applications in and beyond NLP. Fintan, the Flexible INtegrated Transformation and Annotation eNgineering platform introduced in 2020, is designed to rapidly convert, combine and manipulate language resources both in and outside the Semantic Web by transforming them into segmented RDF representations, which can be processed in parallel in a multithreaded environment, and by integrating them with ontologies and taxonomies. Fintan has recently been extended with a set of additional modules that increase the number of supported non-RDF formats and improve interoperability with existing non-Java conversion tools; parts of this work are demonstrated in this paper. In particular, we focus on a novel recipe for resource transformation in which Fintan works in tandem with the Pepper toolset to allow computational linguists to transform their data between over 50 linguistic corpus formats with a graphical workflow manager.
Anthology ID:
2022.ldl-1.8
Volume:
Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Thierry Declerck, John P. McCrae, Elena Montiel, Christian Chiarcos, Maxim Ionov
Venue:
LDL
Publisher:
European Language Resources Association
Pages:
61–68
URL:
https://aclanthology.org/2022.ldl-1.8
Cite (ACL):
Christian Fäth and Christian Chiarcos. 2022. Spicy Salmon: Converting between 50+ Annotation Formats with Fintan, Pepper, Salt and Powla. In Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference, pages 61–68, Marseille, France. European Language Resources Association.
Cite (Informal):
Spicy Salmon: Converting between 50+ Annotation Formats with Fintan, Pepper, Salt and Powla (Fäth & Chiarcos, LDL 2022)
PDF:
https://aclanthology.org/2022.ldl-1.8.pdf