Defying Wikidata: Validation of Terminological Relations in the Web of Data

Patricia Martín-Chozas, Sina Ahmadi, Elena Montiel-Ponsoda


Abstract
In this paper we present an approach to validate terminological data retrieved from open encyclopaedic knowledge bases. This need arises from the enrichment of automatically extracted terms with information from existing resources in the Linguistic Linked Open Data cloud. Specifically, the resource employed for this enrichment is WIKIDATA, since it is one of the biggest knowledge bases freely available within the Semantic Web. During the experiment, we noticed that certain RDF properties in the Knowledge Base did not contain the data they are intended to represent, but a different type of information. In this paper we propose an approach to validate the retrieved data based on four axioms that rely on two linguistic theories: the X-bar theory and the multidimensional theory of terminology. The validation process is supported by a second knowledge base specialised in linguistic data; in this case, CONCEPTNET. In our experiment, we validate terms from the legal domain in four languages: Dutch, English, German and Spanish. The final aim is to generate a set of sound and reliable terminological resources in RDF to contribute to the population of the Linguistic Linked Open Data cloud.
Anthology ID:
2020.lrec-1.694
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5654–5659
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.694
Cite (ACL):
Patricia Martín-Chozas, Sina Ahmadi, and Elena Montiel-Ponsoda. 2020. Defying Wikidata: Validation of Terminological Relations in the Web of Data. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5654–5659, Marseille, France. European Language Resources Association.
Cite (Informal):
Defying Wikidata: Validation of Terminological Relations in the Web of Data (Martín-Chozas et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.694.pdf
Code:
sinaahmadi/LDTerm