Penny Labropoulou

Also published as: P. Labropoulou


2024

pdf bib
Common European Language Data Space
Georg Rehm | Stelios Piperidis | Khalid Choukri | Andrejs Vasiļjevs | Katrin Marheinecke | Victoria Arranz | Aivars Bērziņš | Miltos Deligiannis | Dimitris Galanis | Maria Giagkou | Katerina Gkirtzou | Dimitris Gkoumas | Annika Grützner-Zahn | Athanasia Kolovou | Penny Labropoulou | Andis Lagzdiņš | Elena Leitner | Valérie Mapelli | Hélène Mazo | Simon Ostermann | Stefania Racioppa | Mickaël Rigault | Leon Voukoutis
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The Common European Language Data Space (LDS) is an integral part of the EU data strategy, which aims at developing a single market for data. Its decentralised technical infrastructure and governance scheme are currently being developed by the LDS project, which also has dedicated tasks for proof-of-concept prototypes, handling legal aspects, raising awareness and promoting the LDS through events and social media channels. The LDS is part of a broader vision for establishing all necessary components to develop European large language models.

pdf bib
European Language Grid: One Year after
Georg Rehm | Stelios Piperidis | Dimitris Galanis | Penny Labropoulou | Maria Giagkou | Miltos Deligiannis | Leon Voukoutis | Martin Courtois | Julian Moreno-Schneider | Katrin Marheinecke
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The European Language Grid (ELG) is a cloud platform for the whole European Language Technology community. While the EU project that developed the platform successfully concluded in June 2022, the ELG initiative has continued. This article provides a description of the current state of ELG in terms of user adoption and number of language resources and technologies available in early 2024. It also provides an overview of the various activities with regard to ELG since the end of the project and since the publication of the ELG book, especially the co-authors’ attempt to integrate the ELG platform into various data space initiatives. The article also provides an overview of the Digital Language Equality (DLE) dashboard and the current state of DLE in Europe.

pdf bib
MultiLexBATS: Multilingual Dataset of Lexical Semantic Relations
Dagmar Gromann | Hugo Goncalo Oliveira | Lucia Pitarch | Elena-Simona Apostol | Jordi Bernad | Eliot Bytyçi | Chiara Cantone | Sara Carvalho | Francesca Frontini | Radovan Garabik | Jorge Gracia | Letizia Granata | Fahad Khan | Timotej Knez | Penny Labropoulou | Chaya Liebeskind | Maria Pia Di Buono | Ana Ostroški Anić | Sigita Rackevičienė | Ricardo Rodrigues | Gilles Sérasset | Linas Selmistraitis | Mahammadou Sidibé | Purificação Silvano | Blerina Spahiu | Enriketa Sogutlu | Ranka Stanković | Ciprian-Octavian Truică | Giedre Valunaite Oleskeviciene | Slavko Zitnik | Katerina Zdravkova
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has either focused on analysing lexical semantic relations in word embeddings or probing pretrained language models (PLMs), with some exceptions. Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it across languages. To start addressing this question, we propose MultiLexBATS, a multilingual parallel dataset of lexical semantic relations adapted from BATS in 15 languages including low-resource languages, such as Bambara, Lithuanian, and Albanian. As experiment on cross-lingual transfer of relational knowledge, we test the PLMs’ ability to (1) capture analogies across languages, and (2) predict translation targets. We find considerable differences across relation types and languages with a clear preference for hypernymy and antonymy as well as romance languages.

2022

pdf bib
Computational Morphology with OntoLex-Morph
Christian Chiarcos | Katerina Gkirtzou | Fahad Khan | Penny Labropoulou | Marco Passarotti | Matteo Pellegrini
Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference

This paper describes the current status of the emerging OntoLex module for linguistic morphology. It serves as an update to the previous version of the vocabulary (Klimek et al. 2019). Whereas this earlier model was exclusively focusing on descriptive morphology and focused on applications in lexicography, we now present a novel part and a novel application of the vocabulary to applications in language technology, i.e., the rule-based generation of lexicons, introducing a dynamic component into OntoLex.

pdf bib
Collaborative Metadata Aggregation and Curation in Support of Digital Language Equality Monitoring
Maria Giagkou | Stelios Piperidis | Penny Labropoulou | Miltos Deligiannis | Athanasia Kolovou | Leon Voukoutis
Proceedings of the Workshop Towards Digital Language Equality within the 13th Language Resources and Evaluation Conference

The European Language Equality (ELE) project develops a strategic research, innovation and implementation agenda (SRIA) and a roadmap for achieving full digital language equality in Europe by 2030. Key component of the SRIA development is an accurate estimation of the current standing of languages with respect to their technological readiness. In this paper we present the empirical basis on which such estimation is grounded, its starting point and in particular the automatic and collaborative methods used for extending it. We focus on the collaborative expert activities, the challenges posed, and the solutions adopted. We also briefly present the dashboard application developed for querying and visualising the empirical data as well as monitoring and comparing the evolution of technological support within and across languages.

pdf bib
Categorizing legal features in a metadata-oriented task: defining the conditions of use
Mickaël Rigault | Victoria Arranz | Valérie Mapelli | Penny Labropoulou | Stelios Piperidis
Proceedings of the Workshop on Ethical and Legal Issues in Human Language Technologies and Multilingual De-Identification of Sensitive Data In Language Resources within the 13th Language Resources and Evaluation Conference

2021

pdf bib
European Language Grid: A Joint Platform for the European Language Technology Community
Georg Rehm | Stelios Piperidis | Kalina Bontcheva | Jan Hajic | Victoria Arranz | Andrejs Vasiļjevs | Gerhard Backfried | Jose Manuel Gomez-Perez | Ulrich Germann | Rémi Calizzano | Nils Feldhus | Stefanie Hegele | Florian Kintzel | Katrin Marheinecke | Julian Moreno-Schneider | Dimitris Galanis | Penny Labropoulou | Miltos Deligiannis | Katerina Gkirtzou | Athanasia Kolovou | Dimitris Gkoumas | Leon Voukoutis | Ian Roberts | Jana Hamrlova | Dusan Varis | Lukas Kacena | Khalid Choukri | Valérie Mapelli | Mickaël Rigault | Julija Melnika | Miro Janosik | Katja Prinz | Andres Garcia-Silva | Cristian Berrio | Ondrej Klejch | Steve Renals
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

Europe is a multilingual society, in which dozens of languages are spoken. The only option to enable and to benefit from multilingualism is through Language Technologies (LT), i.e., Natural Language Processing and Speech Technologies. We describe the European Language Grid (ELG), which is targeted to evolve into the primary platform and marketplace for LT in Europe by providing one umbrella platform for the European LT landscape, including research and industry, enabling all stakeholders to upload, share and distribute their services, products and resources. At the end of our EU project, which will establish a legal entity in 2022, the ELG will provide access to approx. 1300 services for all European languages as well as thousands of data sets.

2020

pdf bib
European Language Grid: An Overview
Georg Rehm | Maria Berger | Ela Elsholz | Stefanie Hegele | Florian Kintzel | Katrin Marheinecke | Stelios Piperidis | Miltos Deligiannis | Dimitris Galanis | Katerina Gkirtzou | Penny Labropoulou | Kalina Bontcheva | David Jones | Ian Roberts | Jan Hajič | Jana Hamrlová | Lukáš Kačena | Khalid Choukri | Victoria Arranz | Andrejs Vasiļjevs | Orians Anvari | Andis Lagzdiņš | Jūlija Meļņika | Gerhard Backfried | Erinç Dikici | Miroslav Janosik | Katja Prinz | Christoph Prinz | Severin Stampler | Dorothea Thomas-Aniola | José Manuel Gómez-Pérez | Andres Garcia Silva | Christian Berrío | Ulrich Germann | Steve Renals | Ondrej Klejch
Proceedings of the Twelfth Language Resources and Evaluation Conference

With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs). European LT business is dominated by hundreds of SMEs and a few large players. Many are world-class, with technologies that outperform the global players. However, European LT business is also fragmented – by nation states, languages, verticals and sectors, significantly holding back its impact. The European Language Grid (ELG) project addresses this fragmentation by establishing the ELG as the primary platform for LT in Europe. The ELG is a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources. Once fully operational, it will enable the commercial and non-commercial European LT community to deposit and upload their technologies and data sets into the ELG, to deploy them through the grid, and to connect with other resources. The ELG will boost the Multilingual Digital Single Market towards a thriving European LT community, creating new jobs and opportunities. Furthermore, the ELG project organises two open calls for up to 20 pilot projects. It also sets up 32 national competence centres and the European LT Council for outreach and coordination purposes.

pdf bib
Making Metadata Fit for Next Generation Language Technology Platforms: The Metadata Schema of the European Language Grid
Penny Labropoulou | Katerina Gkirtzou | Maria Gavriilidou | Miltos Deligiannis | Dimitris Galanis | Stelios Piperidis | Georg Rehm | Maria Berger | Valérie Mapelli | Michael Rigault | Victoria Arranz | Khalid Choukri | Gerhard Backfried | José Manuel Gómez-Pérez | Andres Garcia-Silva
Proceedings of the Twelfth Language Resources and Evaluation Conference

The current scientific and technological landscape is characterised by the increasing availability of data resources and processing tools and services. In this setting, metadata have emerged as a key factor facilitating management, sharing and usage of such digital assets. In this paper we present ELG-SHARE, a rich metadata schema catering for the description of Language Resources and Technologies (processing and generation services and tools, models, corpora, term lists, etc.), as well as related entities (e.g., organizations, projects, supporting documents, etc.). The schema powers the European Language Grid platform that aims to be the primary hub and marketplace for industry-relevant Language Technology in Europe. ELG-SHARE has been based on various metadata schemas, vocabularies, and ontologies, as well as related recommendations and guidelines.

pdf bib
Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability
Georg Rehm | Dimitris Galanis | Penny Labropoulou | Stelios Piperidis | Martin Welß | Ricardo Usbeck | Joachim Köhler | Miltos Deligiannis | Katerina Gkirtzou | Johannes Fischer | Christian Chiarcos | Nils Feldhus | Julian Moreno-Schneider | Florian Kintzel | Elena Montiel | Víctor Rodríguez Doncel | John Philip McCrae | David Laqua | Irina Patricia Theile | Christian Dittmar | Kalina Bontcheva | Ian Roberts | Andrejs Vasiļjevs | Andis Lagzdiņš
Proceedings of the 1st International Workshop on Language Technology Platforms

With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five different levels (of increasing complexity) of platform interoperability that we suggest to implement in a wider federation of AI/LT platforms. We illustrate the approach using the five emerging AI/LT platforms AI4EU, ELG, Lynx, QURATOR and SPEAKER.

2018

pdf bib
A Legal Perspective on Training Models for Natural Language Processing
Richard Eckart de Castilho | Giulia Dore | Thomas Margoni | Penny Labropoulou | Iryna Gurevych
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Managing Public Sector Data for Multilingual Applications Development
Stelios Piperidis | Penny Labropoulou | Miltos Deligiannis | Maria Giagkou
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2015

pdf bib
RDF Representation of Licenses for Language Resources
Victor Rodriguez-Doncel | Penny Labropoulou
Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications

2014

pdf bib
Developing a Framework for Describing Relations among Language Resources
Penny Labropoulou | Christopher Cieri | Maria Gavrilidou
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we study relations holding between language resources as implemented in activities concerned with their documentation. We envision the term “language resources” with an inclusive definition covering datasets (corpora, lexica, ontologies, grammars, etc.), tools (including web services, workflows, platforms etc.), related publications and documentation, specifications and guidelines. However, the scope of the paper is limited to relations holding for datasets and tools. The study fosuses on the META-SHARE infrastructure and the Linguistic Data Consortium and takes into account the ISOcat DCR relations. Based on this study, we propose a taxonomy of relations, discuss their semantics and provide specifications for their use in order to cater for semantic interoperability. Issues of granularity, redundancy in codification, naming conventions and semantics of the relations are presented.

2012

pdf bib
The META-SHARE Metadata Schema for the Description of Language Resources
Maria Gavrilidou | Penny Labropoulou | Elina Desipri | Stelios Piperidis | Haris Papageorgiou | Monica Monachini | Francesca Frontini | Thierry Declerck | Gil Francopoulo | Victoria Arranz | Valerie Mapelli
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborates on the distinction between minimal and maximal versions thereof, briefly presents the integrated environment supporting the LRs description and search and retrieval processes and concludes with work to be done in the future for the improvement of the model.

2011

pdf bib
A Metadata Schema for the Description of Language Resources (LRs)
Maria Gavrilidou | Penny Labropoulou | Stelios Piperidis | Monica Monachini | Francesca Frontini | Gil Francopoulo | Victoria Arranz | Valérie Mapelli
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm

2006

pdf bib
Language Resources Production Models: the Case of the INTERA Multilingual Corpus and Terminology
Maria Gavrilidou | Penny Labropoulou | Stelios Piperidis | Voula Giouli | Nicoletta Calzolari | Monica Monachini | Claudia Soria | Khalid Choukri
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper reports on the multilingual Language Resources (MLRs), i.e. parallel corpora and terminological lexicons for less widely digitally available languages, that have been developed in the INTERA project and the methodology adopted for their production. Special emphasis is given to the reality factors that have influenced the MLRs development approach and their final constitution. Building on the experience gained in the project, a production model has been elaborated, suggesting ways and techniques that can be exploited in order to improve LRs production taking into account realistic issues.

2004

pdf bib
Building Parallel Corpora for eContent Professionals
M. Gavrilidou | P. Labropoulou | E. Desipri | V. Giouli | V. Antonopoulos | S. Piperidis
Proceedings of the Workshop on Multilingual Linguistic Resources

2000

pdf bib
Automatic Generation of Dictionary Definitions from a Computational Lexicon
Penny Labropoulou | Elena Mantzari | Harris Papageorgiou | Maria Gavrilidou
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf bib
Design and Implementation of the Online ILSP Greek Corpus
Nick Hatzigeorgiu | Maria Gavrilidou | Stelios Piperidis | George Carayannis | Anastasia Papakostopoulou | Athanassia Spiliotopoulou | Anna Vacalopoulou | Penny Labropoulou | Elena Mantzari | Harris Papageorgiou | Iason Demiros
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

Search
Co-authors
Venues