Harris Papageorgiou

Also published as: Haris Papageorgiou


2024

Leveraging fine-tuned Large Language Models with LoRA for Effective Claim, Claimer, and Claim Object Detection
Sotiris Kotitsas | Panagiotis Kounoudis | Eleni Koutli | Haris Papageorgiou
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Misinformation and disinformation phenomena existed long before the advent of digital technologies. The exponential use of social media platforms, whose information feeds have created the conditions for many-to-many communication and instant amplification of the news, has accelerated the diffusion of inaccurate and misleading information. As a result, the identification of claims has emerged as a pivotal technology for combating the influence of misinformation and disinformation within news media. Most existing work has concentrated on claim analysis at the sentence level, neglecting the crucial exploration of supplementary attributes such as the claimer and the claim object, or confining it by limiting its scope to a predefined list of topics. Furthermore, previous research has been mostly centered around political debates, Wikipedia articles, and COVID-19-related content. By leveraging the advanced capabilities of Large Language Models (LLMs) in Natural Language Understanding (NLU) and text generation, we propose a novel architecture utilizing LLMs fine-tuned with LoRA to transform the claim, claimer and claim object detection task into a Question Answering (QA) setting. We evaluate our approach on a dataset of 867 scientific news articles from 3 domains (Health, Climate Change, Nutrition) (HCN), which are human-annotated with the major claim, the claimer and the object of the major claim. We also evaluate our proposed model on the NEWSCLAIMS benchmark dataset. Experimental and qualitative results showcase the effectiveness of the proposed approach. We make our dataset publicly available to encourage further research.

2023

Empowering Knowledge Discovery from Scientific Literature: A novel approach to Research Artifact Analysis
Petros Stavropoulos | Ioannis Lyris | Natalia Manola | Ioanna Grypari | Haris Papageorgiou
Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)

Knowledge extraction from scientific literature is a major issue, crucial to promoting transparency, reproducibility, and innovation in the research community. In this work, we present a novel approach towards the identification, extraction and analysis of dataset and code/software mentions within scientific literature. We introduce a comprehensive dataset, synthetically generated by ChatGPT and meticulously curated, augmented, and expanded with real snippets of scientific text from full-text publications in Computer Science using a human-in-the-loop process. The dataset contains snippets highlighting mentions of the two research artifact (RA) types: dataset and code/software, along with insightful metadata including their Name, Version, License, URL as well as the intended Usage and Provenance. We also fine-tune a simple Large Language Model (LLM) using Low-Rank Adaptation (LoRA) to transform the Research Artifact Analysis (RAA) into an instruction-based Question Answering (QA) task. Ultimately, we report the improvements in performance on the test set of our dataset when compared to other base LLM models. Our method provides a significant step towards facilitating accurate, effective, and efficient extraction of datasets and software from scientific papers, contributing to the challenges of reproducibility and reusability in scientific research.

2021

Argumentation Mining in Scientific Literature for Sustainable Development
Aris Fergadis | Dimitris Pappas | Antonia Karamolegkou | Haris Papageorgiou
Proceedings of the 8th Workshop on Argument Mining

Science, technology and innovation (STI) policies have evolved in the past decade. We are now progressing towards policies that are more aligned with sustainable development through integrating social, economic and environmental dimensions. In this new policy environment, the need to keep track of innovation from its conception in Science and Research has emerged. Argumentation mining, an interdisciplinary NLP field, gives rise to the required technologies. In this study, we present the first STI-driven multidisciplinary corpus of scientific abstracts annotated for argumentative units (AUs) on the sustainable development goals (SDGs) set by the United Nations (UN). AUs are the sentences conveying the Claim(s) reported in the author’s original research and the Evidence provided for support. We also present a set of strong, BERT-based neural baselines achieving an F1-score of 70.0 for Claim and 62.4 for Evidence identification, evaluated with 10-fold cross-validation. To demonstrate the effectiveness of our models, we experiment with different test sets, showing comparable performance across various SDG policy domains. Our dataset and models are publicly available for research purposes.

2020

Research & Innovation Activities’ Impact Assessment: The Data4Impact System
Ioanna Grypari | Dimitris Pappas | Natalia Manola | Haris Papageorgiou
Proceedings of the 1st Workshop on Language Technologies for Government and Public Administration (LT4Gov)

Cat. 2 Show-case: We present the Data4Impact (D4I) platform, a novel end-to-end system for evidence-based, timely and accurate monitoring and evaluation of research and innovation (R&I) activities. Using the latest technological advances in Human Language Technology (HLT) and our data-driven methodology, we build a novel set of indicators in order to track funded projects and their impact on science, the economy and the society as a whole, during and after the project life-cycle. We develop our methodology by targeting Health-related EC projects from 2007 to 2019 to produce solutions that meet the needs of stakeholders (mainly policy-makers and research funders). Various D4I text analytics workflows process datasets and their metadata, extract valuable insights and estimate intermediate results and metrics, culminating in a set of robust indicators that the users can interact with through our dashboard, the D4I Monitor (available at monitor.data4impact.eu). Therefore, our approach, which can be generalized to different contexts, is multidimensional (technology, tools, indicators, dashboard) and the resulting system can provide an innovative solution for public administrators in their policy-making needs related to RDI funding allocation.

Protest Event Analysis: A Longitudinal Analysis for Greece
Konstantina Papanikolaou | Haris Papageorgiou
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

The advent of Big Data has shifted social science research towards computational methods. The volume of data that is nowadays available has brought a radical change in traditional approaches due to the cost and effort needed for processing. Knowledge extraction from heterogeneous and ample data is not an easy task to tackle. Thus, interdisciplinary approaches are necessary, combining experts from both social and computer science. This paper presents work in the context of protest analysis, which falls within the scope of Computational Social Science. More specifically, the contribution of this work is to describe a Computational Social Science methodology for Event Analysis. The presented methodology is generic in the sense that it can be applied to any event typology; moreover, it is innovative and suitable for interdisciplinary tasks as it incorporates the human-in-the-loop. Additionally, a case study is presented concerning Protest Analysis in Greece over the last two decades. The conceptual foundation lies mainly upon claims analysis, and newspaper data were used in order to map, document and discuss protests in Greece in a longitudinal perspective.

2018

BioRead: A New Dataset for Biomedical Reading Comprehension
Dimitris Pappas | Ion Androutsopoulos | Haris Papageorgiou
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Tweester at SemEval-2017 Task 4: Fusion of Semantic-Affective and pairwise classification models for sentiment analysis in Twitter
Athanasia Kolovou | Filippos Kokkinos | Aris Fergadis | Pinelopi Papalampidi | Elias Iosif | Nikolaos Malandrakis | Elisavet Palogiannidi | Haris Papageorgiou | Shrikanth Narayanan | Alexandros Potamianos
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we describe our submission to SemEval-2017 Task 4: Sentiment Analysis in Twitter. Specifically, the proposed system participated in both the tweet polarity classification (two-, three- and five-class) and tweet quantification (two- and five-class) tasks.

Universal Dependencies for Greek
Prokopis Prokopidis | Haris Papageorgiou
Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017)

2016

SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Maria Pontiki | Dimitris Galanis | Haris Papageorgiou | Ion Androutsopoulos | Suresh Manandhar | Mohammad AL-Smadi | Mahmoud Al-Ayyoub | Yanyan Zhao | Bing Qin | Orphée De Clercq | Véronique Hoste | Marianna Apidianaki | Xavier Tannier | Natalia Loukachevitch | Evgeniy Kotelnikov | Nuria Bel | Salud María Jiménez-Zafra | Gülşen Eryiğit
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

Tweester at SemEval-2016 Task 4: Sentiment Analysis in Twitter Using Semantic-Affective Model Adaptation
Elisavet Palogiannidi | Athanasia Kolovou | Fenia Christopoulou | Filippos Kokkinos | Elias Iosif | Nikolaos Malandrakis | Haris Papageorgiou | Shrikanth Narayanan | Alexandros Potamianos
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

SemEval-2015 Task 12: Aspect Based Sentiment Analysis
Maria Pontiki | Dimitris Galanis | Haris Papageorgiou | Suresh Manandhar | Ion Androutsopoulos
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

META-SHARE: One year after
Stelios Piperidis | Harris Papageorgiou | Christian Spurk | Georg Rehm | Khalid Choukri | Olivier Hamon | Nicoletta Calzolari | Riccardo del Gratta | Bernardo Magnini | Christian Girardi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents META-SHARE (www.meta-share.eu), an open language resource infrastructure, and its usage since its Europe-wide deployment in early 2013. META-SHARE is a network of repositories that store language resources (data, tools and processing services) documented with high-quality metadata, aggregated in central inventories allowing for uniform search and access. META-SHARE was developed by META-NET (www.meta-net.eu) and aims to serve as an important component of a language technology marketplace for researchers, developers, professionals and industrial players, catering for the full development cycle of language technology, from research through to innovative products and services. The observed usage in its initial steps, the steadily increasing number of network nodes, resources, users, queries, views and downloads are all encouraging and considered as supportive of the choices made so far. In tandem, take-up activities like direct linking and processing of datasets by language processing services as well as metadata transformation to RDF are expected to open new avenues for data and resources linking and boost the organic growth of the infrastructure while facilitating language technology deployment by much wider research communities and industrial sectors.

SemEval-2014 Task 4: Aspect Based Sentiment Analysis
Maria Pontiki | Dimitris Galanis | John Pavlopoulos | Harris Papageorgiou | Ion Androutsopoulos | Suresh Manandhar
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Experiments for Dependency Parsing of Greek
Prokopis Prokopidis | Haris Papageorgiou
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages

2012

The META-SHARE Metadata Schema for the Description of Language Resources
Maria Gavrilidou | Penny Labropoulou | Elina Desipri | Stelios Piperidis | Haris Papageorgiou | Monica Monachini | Francesca Frontini | Thierry Declerck | Gil Francopoulo | Victoria Arranz | Valerie Mapelli
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborates on the distinction between minimal and maximal versions thereof, briefly presents the integrated environment supporting the LRs description and search and retrieval processes and concludes with work to be done in the future for the improvement of the model.

2006

Multi-domain Multi-lingual Named Entity Recognition: Revisiting & Grounding the resources issue
Voula Giouli | Alexis Konstandinidis | Elina Desypri | Harris Papageorgiou
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The paper reports on the development methodology of a system aimed at multi-domain multi-lingual recognition and classification of names in texts, the focus being on the linguistic resources used for training and testing purposes. The corpus presented here has been collected and annotated in the framework of different projects, the critical issue being the development of a final resource that is homogeneous, re-usable and adaptable to different domains and languages with a view to robust multi-domain and multi-lingual NERC.

Adding multi-layer semantics to the Greek Dependency Treebank
Harris Papageorgiou | Elina Desipri | Maria Koutsombogera | Kanella Pouli | Prokopis Prokopidis
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we give an overview of the approach adopted to add a layer of semantic information to the Greek Dependency Treebank [GDT]. Our ultimate goal is to come up with a large corpus, reliably annotated with rich semantic structures. To this end, a corpus has been compiled encompassing various data sources and domains. This collection has been preprocessed, annotated and validated on the basis of dependency representation. Taking into account multi-layered annotation schemes designed to provide deeper representations of structure and meaning, we describe the methodology followed as regards the semantic layer, we report on the annotation process and the problems faced and we conclude with comments on future work and exploitation of the resulting resource.

2004

The COST278 Pan-European Broadcast News Database
An Vandecatseye | Jean-Pierre Martens | Joao Neto | Hugo Meinedo | Carmen Garcia-Mateo | Javier Dieguez | France Mihelic | Janez Zibert | Jan Nouza | Petr David | Matus Pleva | Anton Cizmar | Harris Papageorgiou | Christina Alexandris
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2002

Multi-level XML-based Corpus Annotation
Harris Papageorgiou | Prokopis Prokopidis | Voula Giouli | Iason Demiros | Alexis Konstantinidis | Stelios Piperidis
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2000

Named Entity Recognition in Greek Texts
Iason Demiros | Sotiris Boutsis | Voula Giouli | Maria Liakata | Harris Papageorgiou | Stelios Piperidis
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

A Unified POS Tagging Architecture and its Application to Greek
Harris Papageorgiou | Prokopis Prokopidis | Voula Giouli | Stelios Piperidis
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

Automatic Generation of Dictionary Definitions from a Computational Lexicon
Penny Labropoulou | Elena Mantzari | Harris Papageorgiou | Maria Gavrilidou
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

Design and Implementation of the Online ILSP Greek Corpus
Nick Hatzigeorgiu | Maria Gavrilidou | Stelios Piperidis | George Carayannis | Anastasia Papakostopoulou | Athanassia Spiliotopoulou | Anna Vacalopoulou | Penny Labropoulou | Elena Mantzari | Harris Papageorgiou | Iason Demiros
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

1994

A Matching Technique in Example-Based Machine Translation
Lambros Cranias | Harris Papageorgiou | Stelios Piperidis
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

Automatic Alignment in Parallel Corpora
Harris Papageorgiou | Lambros Cranias | Stelios Piperidis
32nd Annual Meeting of the Association for Computational Linguistics
