Benamara Farah


Image and Text: Fighting the same Battle? Super Resolution Learning for Imbalanced Text Classification
Romain Meunier | Benamara Farah | Véronique Moriceau | Patricia Stolf
Findings of the Association for Computational Linguistics: EMNLP 2023

In this paper, we propose SRL4NLP, a new approach to data augmentation that draws an analogy between image and text processing: super-resolution learning. This method uses high-resolution images to overcome the problem of low-resolution images. While this technique is commonly used in image processing when images have a low resolution or are too noisy, it has never been used in NLP. We therefore propose the first adaptation of this method for text classification and evaluate its effectiveness on urgency detection from tweets posted in crisis situations, a very challenging task where messages are scarce and highly imbalanced. We show that this strategy is efficient when compared to competitive state-of-the-art data augmentation techniques on several benchmark datasets in two languages.


Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus
Nicholas Asher | Julie Hunter | Mathieu Morey | Benamara Farah | Stergos Afantenos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper describes the STAC resource, a corpus of multi-party chats annotated for discourse structure in the style of SDRT (Asher and Lascarides, 2003; Lascarides and Asher, 2009). The main goal of the STAC project is to study the discourse structure of multi-party dialogues in order to understand the linguistic strategies adopted by interlocutors to achieve their conversational goals, especially when these goals are opposed. The STAC corpus is not only a rich source of data on strategic conversation, but also, to our knowledge, the first corpus that provides full discourse structures for multi-party dialogues. It has other notable features that make it an interesting resource for other topics: interleaved threads, creative language, and interactions between linguistic and extra-linguistic contexts.