2023
MAD-TSC: A Multilingual Aligned News Dataset for Target-dependent Sentiment Classification
Evan Dufraisse | Adrian Popescu | Julien Tourille | Armelle Brun | Jerome Deshayes
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Target-dependent sentiment classification (TSC) enables a fine-grained automatic analysis of sentiments expressed in texts. Sentiment expression varies depending on the domain, and it is necessary to create domain-specific datasets. While socially important, TSC in the news domain remains relatively understudied. We introduce MAD-TSC, a new dataset which differs substantially from existing resources. First, it includes aligned examples in eight languages to facilitate a comparison of performance for individual languages, and a direct comparison of human and machine translation. Second, the dataset is sampled from a diversified parallel news corpus, and is diversified in terms of news sources and geographic spread of entities. Finally, MAD-TSC is more challenging than existing datasets because its examples are more complex. We exemplify the use of MAD-TSC with comprehensive monolingual and multilingual experiments. The latter show that machine translations can successfully replace manual ones, and that performance for all included languages can match that of English by automatically translating test examples.
2022
Don’t Burst Blindly: For a Better Use of Natural Language Processing to Fight Opinion Bubbles in News Recommendations
Evan Dufraisse | Célina Treuillier | Armelle Brun | Julien Tourille | Sylvain Castagnos | Adrian Popescu
Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences
Online news consumption plays an important role in shaping the political opinions of citizens. The news is often served by recommendation algorithms, which adapt content to users’ preferences. Such algorithms can lead to political polarization as the societal effects of the recommended content and recommendation design are disregarded. We posit that biases appear, at least in part, due to a weak entanglement between natural language processing and recommender systems, even though both are at work in the diffusion and personalization of online information. We assume that both diversity and acceptability of recommended content would benefit from such a synergy. We discuss the limitations of current approaches as well as promising leads for integrating opinion mining into the political news recommendation process.
2013
Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge
Dhouha Bouamor | Adrian Popescu | Nasredine Semmar | Pierre Zweigenbaum
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
LIPN-CORE: Semantic Text Similarity using n-grams, WordNet, Syntactic Analysis, ESA and Information Retrieval based Features
Davide Buscaldi | Joseph Le Roux | Jorge J. García Flores | Adrian Popescu
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity
2008
A Conceptual Approach to Web Image Retrieval
Adrian Popescu | Gregory Grefenstette
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
People use the Internet to find a wide variety of images. Existing image search engines do not understand the pictures they return. The introduction of semantic layers in information retrieval frameworks may enhance the quality of the results compared to existing systems. One important challenge in the field is to develop architectures that fit the requirements of real-life applications, such as Internet search engines. In this paper, we describe Olive, an image retrieval application that exploits a large-scale conceptual hierarchy (extracted from WordNet) to automatically reformulate user queries, search for associated images and present results in an interactive and structured fashion. When searching for a concept in the hierarchy, Olive reformulates the query using its deepest subtypes in WordNet. On the answers page, the system displays a selection of related classes and proposes a content-based retrieval functionality among the pictures sharing the same linguistic label. In order to validate our approach, we run two series of tests to assess the performance of the application and report the results here. First, two precision evaluations are carried out over a panel of concepts from different domains; second, a user test is designed to assess interaction with the system.