Modelling Collocations in OntoLex-FrAC

Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică


Abstract
Following earlier presentations of frequency and attestations, and of embeddings and distributional similarity, this paper introduces the third cornerstone of OntoLex-FrAC, the emerging OntoLex module for Frequency, Attestation and Corpus-based Information. We provide an RDF vocabulary for collocations, established as a consensus over contributions from five different institutions and numerous data sets. Our goal is to elicit feedback from reviewers, the workshop audience and the scientific community in preparation for the final consolidation of the OntoLex-FrAC module, whose publication as a W3C community report is foreseen for the end of this year. We describe the novel collocation component of OntoLex-FrAC in application to a lexicographic resource and to corpus-based collocation scores available from the web. Finally, we demonstrate the capability and genericity of the model by showing how collocation information can be retrieved and aggregated by means of SPARQL and exported to a tabular format, so that it can be easily processed in downstream applications.
Anthology ID:
2022.gwll-1.3
Volume:
Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Ilan Kernerman, Simon Krek
Venue:
gwll
Publisher:
European Language Resources Association
Pages:
10–18
URL:
https://aclanthology.org/2022.gwll-1.3
Cite (ACL):
Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, and Ciprian-Octavian Truică. 2022. Modelling Collocations in OntoLex-FrAC. In Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference, pages 10–18, Marseille, France. European Language Resources Association.
Cite (Informal):
Modelling Collocations in OntoLex-FrAC (Chiarcos et al., gwll 2022)
PDF:
https://aclanthology.org/2022.gwll-1.3.pdf