A Tree Extension for CoNLL-RDF

Christian Chiarcos, Luis Glaser


Abstract
The technological bridges between knowledge graphs and natural language processing are of utmost importance for the future development of language technology. CoNLL-RDF is a technology that provides such a bridge for popular one-word-per-line formats as widely used in NLP (e.g., the CoNLL Shared Tasks), annotation (Universal Dependencies, Unimorph), corpus linguistics (Corpus WorkBench, CWB) and digital lexicography (SketchEngine): Every empty-line separated table (usually a sentence) is parsed into a graph, can be freely manipulated and enriched using W3C-standardized RDF technology, and can then be serialized back into a TSV format, RDF or other formats. An important limitation is that CoNLL-RDF provides native support for word-level annotations only. This does include dependency syntax and semantic role annotations, but neither phrase structures nor text structure. We describe the extension of the CoNLL-RDF technology stack for two vocabulary extensions of CoNLL-TSV: the PTB bracket notation used in earlier CoNLL Shared Tasks, and the extension with XML markup elements featured by CWB and SketchEngine. In order to represent the necessary extensions of the CoNLL vocabulary in an adequate fashion, we employ the POWLA vocabulary for representing and navigating tree structures.
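As a rough illustration of the bracket-notation case described in the abstract, the following Python sketch splits one-word-per-line input at empty lines and turns a PTB-style bracket column into parent links. It is not the CoNLL-RDF implementation: the function names, the two-column layout, the `:w`/`:n` identifiers and the use of `rdf:value` for phrase labels are assumptions made for this example, and triples are plain tuples rather than real RDF. The `powla:hasParent` property is only meant to echo the POWLA vocabulary mentioned above.

```python
def split_sentences(tsv_text):
    """Split one-word-per-line text into sentences at empty lines."""
    sentences, rows = [], []
    for line in tsv_text.splitlines():
        if line.strip():
            rows.append(line.split("\t"))
        elif rows:
            sentences.append(rows)
            rows = []
    if rows:
        sentences.append(rows)
    return sentences


def bracket_column_to_triples(rows, word_col=0, parse_col=1):
    """Turn a PTB-style bracket column (e.g. "(S(NP*", "*)") into triples.

    Identifiers and property names are illustrative assumptions.
    """
    triples = []
    stack = []    # currently open phrase nodes, innermost last
    counter = 0   # running number for phrase node identifiers
    for i, row in enumerate(rows, start=1):
        word_id = f":w{i}"
        triples.append((word_id, "conll:WORD", row[word_col]))
        # Everything before "*" opens phrases, everything after closes them.
        opening, _, closing = row[parse_col].partition("*")
        for label in filter(None, opening.split("(")):
            counter += 1
            node_id = f":n{counter}"
            triples.append((node_id, "rdf:value", label))
            if stack:
                triples.append((node_id, "powla:hasParent", stack[-1]))
            stack.append(node_id)
        # The word attaches to the innermost open phrase, if any.
        if stack:
            triples.append((word_id, "powla:hasParent", stack[-1]))
        for _ in range(closing.count(")")):
            stack.pop()
    return triples


# Hypothetical two-column input: word form, then bracket notation.
sample = "The\t(S(NP*\ncat\t*)\nsleeps\t(VP*))\n\nDogs\t(S(NP*)\nbark\t(VP*))\n"
sentences = split_sentences(sample)
triples = bracket_column_to_triples(sentences[0])
for triple in triples:
    print(*triple)
```

Keeping a stack of open phrases means each word and each new phrase attaches to the innermost node that is still open, which is what makes the flat column recoverable as a tree.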
Anthology ID:
2020.lrec-1.885
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
7161–7169
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.885
Cite (ACL):
Christian Chiarcos and Luis Glaser. 2020. A Tree Extension for CoNLL-RDF. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 7161–7169, Marseille, France. European Language Resources Association.
Cite (Informal):
A Tree Extension for CoNLL-RDF (Chiarcos & Glaser, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.885.pdf