2024
LODinG: Linked Open Data in the Humanities
Jacek Kudera | Claudia Bamberg | Thomas Burch | Folke Gernert | Maria Hinzmann | Susanne Kabatnik | Claudine Moulin | Benjamin Raue | Achim Rettinger | Jörg Röpke | Ralf Schenkel | Kristin Shi-Kupfer | Doris Schirra | Christof Schöch | Joëlle Weis
Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024
We present LODinG, Linked Open Data in the Humanities (abbreviated from Linked Open Data in den Geisteswissenschaften), a recently launched research initiative exploring the intersection of Linked Open Data (LOD) and a range of areas of work within the humanities. We focus on effective methods of collecting, modeling, linking, releasing and analyzing machine-readable information relevant to (digital) humanities research in the form of LOD. LODinG combines the sources and methods of digital humanities, general and computational linguistics, digital lexicography, German and Romance philology, translatology, cultural and literary studies, media studies, information science and law to explore and expand the potential of the LOD paradigm for such a diverse and multidisciplinary field. The project’s primary objectives are to improve the methods of extracting, modeling and analyzing multilingual data in the LOD paradigm; to demonstrate the application of linguistic LOD to various methods and domains within and beyond the humanities; and to develop a modular, cross-domain data model for the humanities.
FZI-WIM at AVeriTeC Shared Task: Real-World Fact-Checking with Question Answering
Jin Liu | Steffen Thoma | Achim Rettinger
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)
This paper describes the FZI-WIM system at the AVeriTeC shared task, which aims to assess evidence-based automated fact-checking systems for real-world claims with evidence retrieved from the web. The FZI-WIM system uses open-source models to build a reliable fact-checking pipeline via question answering. With different experimental setups, we show that more questions lead to higher scores in the shared task. In both the question generation and question-answering stages, sampling can improve the performance of our system. We further analyze the limitations of current open-source models for real-world claim verification. Our code is publicly available at https://github.com/jens5588/FZI-WIM-AVERITEC.
2016
Bilingual Word Embeddings from Parallel and Non-parallel Corpora for Cross-Language Text Classification
Aditya Mogadala | Achim Rettinger
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2014
xLiD-Lexica: Cross-lingual Linked Data Lexica
Lei Zhang | Michael Färber | Achim Rettinger
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
In this paper, we introduce our cross-lingual linked data lexica, called xLiD-Lexica, which are constructed by exploiting the multilingual Wikipedia and linked data resources from Linked Open Data (LOD). We provide the cross-lingual groundings of linked data resources from LOD as RDF data, which can be easily integrated into the LOD data sources. In addition, we build a SPARQL endpoint over our xLiD-Lexica so that users can easily access them using the SPARQL query language. The availability of such lexica can facilitate multilingual and cross-lingual information access, e.g., by allowing an easy mapping of natural language expressions in different languages to linked data resources from LOD. Many tasks in natural language processing, such as natural language generation, cross-lingual entity linking, text annotation and question answering, can benefit from our xLiD-Lexica.
RECSA: Resource for Evaluating Cross-lingual Semantic Annotation
Achim Rettinger | Lei Zhang | Daša Berović | Danijela Merkler | Matea Srebačić | Marko Tadić
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
In recent years, large repositories of structured knowledge (DBpedia, Freebase, YAGO) have become a valuable resource for language technologies, especially for the automatic aggregation of knowledge from textual data. One essential component of language technologies that leverage such knowledge bases is the linking of words or phrases in specific text documents with elements from the knowledge base (KB). We call this semantic annotation. At the same time, initiatives like Wikidata try to make those knowledge bases less language-dependent in order to allow cross-lingual or language-independent knowledge access. This poses a new challenge to semantic annotation tools, which are typically language-dependent and link documents in one language to a structured knowledge base grounded in the same language. Ultimately, the goal is to construct cross-lingual semantic annotation tools that can link words or phrases in one language to a structured knowledge base in any other language or to a language-independent representation. To support this line of research, we developed what we believe could serve as a gold standard Resource for Evaluating Cross-lingual Semantic Annotation (RECSA). We compiled a hand-annotated parallel corpus of 300 news articles in three languages with cross-lingual semantic groundings to the English Wikipedia and DBpedia. We hope that this new language resource, which is freely available, will help to establish a standard test set and methodology to comparatively evaluate cross-lingual semantic annotation technologies.
XLike Project Language Analysis Services
Xavier Carreras | Lluís Padró | Lei Zhang | Achim Rettinger | Zhixing Li | Esteban García-Cuesta | Željko Agić | Božo Bekavac | Blaz Fortuna | Tadej Štajner
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics
Semantic Annotation, Analysis and Comparison: A Multilingual and Cross-lingual Text Analytics Toolkit
Lei Zhang | Achim Rettinger
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics