2023
MELODI at SemEval-2023 Task 3: In-domain Pre-training for Low-resource Classification of News Articles
Nicolas Devatine | Philippe Muller | Chloé Braud
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper describes our approach to Subtask 1 “News Genre Categorization” of SemEval-2023 Task 3 “Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup”, which aims to determine whether a given news article is an opinion piece, an objective report, or satire. We fine-tuned the domain-specific language model POLITICS, which was pre-trained on a large-scale dataset of more than 3.6M English political news articles following ideology-driven pre-training objectives. To use it in the multilingual setup of the task, we translated all documents into English as a pre-processing step. Our system ranked among the top systems overall in most languages, and ranked 1st on the English dataset.
An Integrated Approach for Political Bias Prediction and Explanation Based on Discursive Structure
Nicolas Devatine | Philippe Muller | Chloé Braud
Findings of the Association for Computational Linguistics: ACL 2023
One crucial aspect of democracy is fair information sharing. While it is hard to prevent biases in news, they should be identified for better transparency. We propose an approach to automatically characterize biases that takes into account structural differences and that is efficient for long texts. This yields new ways to provide explanations for a textual classifier, going beyond mere lexical cues. We show that: (i) the use of discourse-based structure-aware document representations compares well to local, computationally heavy, or domain-specific models on classification tasks that deal with textual bias; and (ii) our approach based on different levels of granularity allows for the generation of better explanations of model decisions, both at the lexical and structural level, while addressing the challenge posed by long texts.
2022
Predicting Political Orientation in News with Latent Discourse Structure to Improve Bias Understanding
Nicolas Devatine | Philippe Muller | Chloé Braud
Proceedings of the 3rd Workshop on Computational Approaches to Discourse
With the growing number of information sources, media bias is becoming a worrying problem for a democratic society. This paper explores the task of predicting the political orientation of news articles, with the goal of analyzing how bias is expressed. We demonstrate that integrating rhetorical dimensions via latent structures over sub-sentential discourse units allows for large improvements, with a +7.4-point difference between the base LSTM model and its discourse-based version, and a +3-point improvement over the previous BERT-based state-of-the-art model. We also argue that this gives a new relevant handle for analyzing political bias in news articles.
Ré-ordonnancement via programmation dynamique pour l’adaptation cross-lingue d’un analyseur en dépendances (Sentence reordering via dynamic programming for cross-lingual dependency parsing)
Nicolas Devatine | Caio Corro | François Yvon
Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale
This article addresses the cross-lingual transfer of dependency parsers and studies methods for limiting the potentially harmful effect on transfer of word-order divergences between the source and target languages. We show how to learn and implement reordering strategies which, used as a pre-processing step, often improve parser performance in a zero-shot transfer scenario.