Artem Revenko
2022
WiC-TSV-de: German Word-in-Context Target-Sense-Verification Dataset and Cross-Lingual Transfer Analysis
Anna Breit | Artem Revenko | Narayani Blaschke
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Target Sense Verification (TSV) is the binary disambiguation task of deciding whether the intended sense of a target word in a context corresponds to a given target sense. In this paper, we introduce WiC-TSV-de, a multi-domain dataset for German Target Sense Verification. While the training and development sets consist of domain-independent instances only, the test set contains domain-bound subsets drawn from four domains: Gastronomy, Medicine, Hunting, and Zoology. The domain-bound subsets incorporate adversarial examples, such as in-domain ambiguous target senses and context-mixing (i.e., using the target sense in an out-of-domain context), which contribute to the challenging nature of the dataset. WiC-TSV-de allows for the development of sense-inventory-independent disambiguation models that can generalise their knowledge to different domain settings. Combining it with the original English WiC-TSV benchmark, we performed monolingual and cross-lingual analyses, in which the evaluated baseline models were not able to solve the dataset to a satisfactory degree, leaving a large gap to human performance.
2021
WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context
Anna Breit | Artem Revenko | Kiamehr Rezaee | Mohammad Taher Pilehvar | Jose Camacho-Collados
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
We present WiC-TSV, a new multi-domain evaluation benchmark for Word Sense Disambiguation. More specifically, we introduce a framework for Target Sense Verification of Words in Context whose distinguishing features are its formulation as a binary classification task, which makes it independent of external sense inventories, and its coverage of various domains. This makes the dataset highly flexible for the evaluation of a diverse set of models and systems in and across domains. WiC-TSV provides three different evaluation settings, depending on the input signals provided to the model. We set baseline performance on the dataset using state-of-the-art language models. Experimental results show that even though these models can perform decently on the task, there remains a gap between machine and human performance, especially in out-of-domain settings. WiC-TSV data is available at https://competitions.codalab.org/competitions/23683.
Proceedings of the 6th Workshop on Semantic Deep Learning (SemDeep-6)
Luis Espinosa-Anke | Dagmar Gromann | Thierry Declerck | Anna Breit | Jose Camacho-Collados | Mohammad Taher Pilehvar | Artem Revenko
Proceedings of the 6th Workshop on Semantic Deep Learning (SemDeep-6)
2020
Orchestrating NLP Services for the Legal Domain
Julian Moreno-Schneider | Georg Rehm | Elena Montiel-Ponsoda | Víctor Rodriguez-Doncel | Artem Revenko | Sotirios Karampatakis | Maria Khvalchik | Christian Sageder | Jorge Gracia | Filippo Maganza
Proceedings of the Twelfth Language Resources and Evaluation Conference
Legal technology is currently receiving a lot of attention from various angles. In this contribution we describe the main technical components of a system that is currently under development in the European innovation project Lynx, which includes partners from industry and research. The key contribution of this paper is a workflow manager that enables the flexible orchestration of workflows based on a portfolio of Natural Language Processing and Content Curation services as well as a Multilingual Legal Knowledge Graph that contains semantic information and meaningful references to legal documents. We also describe different use cases with which we experiment and develop prototypical solutions.
Recent Developments for the Linguistic Linked Open Data Infrastructure
Thierry Declerck | John McCrae | Matthias Hartung | Jorge Gracia | Christian Chiarcos | Elena Montiel | Philipp Cimiano | Artem Revenko | Roser Saurí | Deirdre Lee | Stefania Racioppa | Jamal Nasir | Matthias Orlikowsk | Marta Lanau-Coronas | Christian Fäth | Mariano Rico | Mohammad Fazleh Elahi | Maria Khvalchik | Meritxell Gonzalez | Katharine Cooney
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper we describe the contributions made by the European H2020 project “Prêt-à-LLOD” (‘Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors’) to the further development of the Linguistic Linked Open Data (LLOD) infrastructure. Prêt-à-LLOD aims to develop a new methodology for building data value chains that are applicable to a wide range of sectors and applications and are based on language resources and language technologies that can be integrated by means of semantic technologies. We describe the methods implemented for increasing the number of language data sets in the LLOD. We also present the approach for ensuring interoperability and for porting LLOD data sets and services to other infrastructures, as well as the contribution of the project to existing standards.
2019
Developing and Orchestrating a Portfolio of Natural Legal Language Processing and Document Curation Services
Georg Rehm | Julián Moreno-Schneider | Jorge Gracia | Artem Revenko | Victor Mireles | Maria Khvalchik | Ilan Kernerman | Andis Lagzdins | Marcis Pinnis | Artus Vasilevskis | Elena Leitner | Jan Milde | Pia Weißenhorn
Proceedings of the Natural Legal Language Processing Workshop 2019
We present a portfolio of natural legal language processing and document curation services currently under development in a collaborative European project. First, we give an overview of the project and the different use cases, while, in the main part of the article, we focus upon the 13 different processing services that are being deployed in different prototype applications using a flexible and scalable microservices architecture. Their orchestration is operationalised using a content and document curation workflow manager.
Co-authors
- Anna Breit 3
- Jorge Gracia 3
- Maria Khvalchik 3
- Jose Camacho-Collados 2
- Thierry Declerck 2
- Julian Moreno Schneider 2
- Mohammad Taher Pilehvar 2
- Georg Rehm 2
- Narayani Blaschke 1
- Christian Chiarcos 1
- Philipp Cimiano 1
- Katharine Cooney 1
- Mohammad Fazleh Elahi 1
- Luis Espinosa Anke 1
- Christian Fäth 1
- Meritxell Gonzàlez 1
- Dagmar Gromann 1
- Matthias Hartung 1
- Sotirios Karampatakis 1
- Ilan Kernerman 1
- Andis Lagzdiņš 1
- Marta Lanau-Coronas 1
- Deirdre Lee 1
- Elena Leitner 1
- Filippo Maganza 1
- John Philip McCrae 1
- Jan Milde 1
- Victor Mireles 1
- Elena Montiel 1
- Elena Montiel-Ponsoda 1
- Jamal A. Nasir 1
- Matthias Orlikowsk 1
- Mārcis Pinnis 1
- Stefania Racioppa 1
- Kiamehr Rezaee 1
- Mariano Rico 1
- Victor Rodriguez-Doncel 1
- Christian Sageder 1
- Roser Saurí 1
- Artus Vasilevskis 1
- Pia Weißenhorn 1