2024
Mhm... Yeah? Okay! Evaluating the Naturalness and Communicative Function of Synthesized Feedback Responses in Spoken Dialogue
Carol Figueroa | Marcel de Korte | Magalie Ochs | Gabriel Skantze
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
To create conversational systems with human-like listener behavior, generating short feedback responses (e.g., “mhm”, “ah”, “wow”) appropriate for their context is crucial. These responses convey their communicative function through their lexical form and their prosodic realization. In this paper, we transplant the prosody of feedback responses from human-human U.S. English telephone conversations to a target speaker using two synthesis techniques (TTS and signal processing). Our evaluation focuses on perceived naturalness, contextual appropriateness, and preservation of communicative function. Results indicate that TTS-generated feedback was perceived as more natural than signal-processing-based feedback, with no significant difference in appropriateness. However, the TTS did not consistently convey the communicative function of the original feedback.
The Distracted Ear: How Listeners Shape Conversational Dynamics
Auriane Boudin | Stéphane Rauzy | Roxane Bertrand | Magalie Ochs | Philippe Blache
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
In the realm of human communication, feedback plays a pivotal role in shaping the dynamics of conversations. This study delves into the multifaceted relationship between listener feedback, narration quality, and distraction effects. We present an analysis conducted on the SMYLE corpus, specifically enriched for this study, in which 30 dyads of participants engaged in 1) face-to-face storytelling (8.2 hours) followed by 2) a free conversation (7.8 hours). The storytelling task unfolds in two conditions, where a storyteller engages with either a “normal” or a “distracted” listener. Examining the impact of feedback on storytellers, we discover a positive correlation between the frequency of specific feedback and narration quality in the normal condition, an encouraging result regarding the enhancement of interaction through specific feedback in distraction-free settings. In contrast, in the distracted condition, a negative correlation emerges, suggesting that increased specific feedback may disrupt narration quality and underscoring the complexity of feedback dynamics in human communication. The contribution of this paper is twofold: first, it presents a new, highly enriched resource for the analysis of discourse phenomena in controlled and normal conditions; second, it provides new results on feedback production, its form, and its consequences for discourse quality (with direct applications in human-machine interaction).
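For readers who want to reproduce this kind of per-condition analysis, a minimal sketch follows; the per-dyad numbers are invented and the variable names are ours, not the SMYLE corpus schema.

```python
# Hypothetical sketch: correlating per-storyteller specific-feedback rate
# with narration quality, separately per condition (toy data, not SMYLE).
from scipy.stats import pearsonr

normal = {"rate": [3.1, 4.2, 2.8, 5.0, 3.7],          # specific feedback per minute
          "quality": [6.2, 7.1, 5.9, 7.8, 6.5]}       # rated narration quality
distracted = {"rate": [3.4, 4.8, 2.9, 5.2, 3.6],
              "quality": [5.8, 4.9, 6.1, 4.5, 5.6]}

for name, cond in [("normal", normal), ("distracted", distracted)]:
    r, p = pearsonr(cond["rate"], cond["quality"])
    print(f"{name}: r = {r:+.2f}, p = {p:.3f}")
```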
2022
Annotation of Communicative Functions of Short Feedback Tokens in Switchboard
Carol Figueroa | Adaeze Adigwe | Magalie Ochs | Gabriel Skantze
Proceedings of the Thirteenth Language Resources and Evaluation Conference
There has been a lot of work on predicting the timing of feedback in conversational systems, but less focus on predicting the prosody and lexical form of feedback given its communicative function. In this paper we therefore present our preliminary annotations of the communicative functions of 1627 short feedback tokens from the Switchboard corpus, together with an analysis of their lexical realizations and prosodic characteristics. Since there is no standard scheme for annotating the communicative function of feedback, we propose our own annotation scheme. Although our work is ongoing, our preliminary analysis revealed that lexical tokens such as “yeah” are ambiguous, so lexical form alone is not indicative of function: both the lexical form and the prosodic characteristics need to be taken into account to predict the communicative function. We also found that feedback functions have distinguishable prosodic characteristics in terms of duration, mean pitch, pitch slope, and pitch range.
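The four prosodic characteristics listed above are straightforward to compute from a pitch track. Here is a minimal sketch, assuming a per-frame f0 contour with unvoiced frames already removed; the exact feature definitions are ours, not the paper's.

```python
# Illustrative sketch (assumptions, not the paper's code): computing the four
# prosodic characteristics of one feedback token from its f0 contour.
import numpy as np

def prosodic_features(f0_hz, frame_s=0.01):
    """f0_hz: per-frame pitch values of one token (unvoiced frames removed)."""
    t = np.arange(len(f0_hz)) * frame_s
    slope = np.polyfit(t, f0_hz, 1)[0]        # Hz per second, linear fit
    return {
        "duration_s": len(f0_hz) * frame_s,
        "mean_pitch_hz": float(np.mean(f0_hz)),
        "pitch_slope_hz_per_s": float(slope),
        "pitch_range_hz": float(np.max(f0_hz) - np.min(f0_hz)),
    }

print(prosodic_features(np.array([180.0, 185.0, 192.0, 200.0, 204.0])))
```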
Are You Smiling When I Am Speaking?
Auriane Boudin | Roxane Bertrand | Magalie Ochs | Philippe Blache | Stephane Rauzy
Proceedings of the Workshop on Smiling and Laughter across Contexts and the Life-span within the 13th Language Resources and Evaluation Conference
The aim of this study is to investigate conversational feedback containing smiles and laughs. First, we propose a statistical analysis of smiles and laughs used as generic and specific feedback in a corpus of French talk-in-interaction. Our results show that low-intensity smiles are preferentially used to produce generic feedback, while high-intensity smiles and laughs are preferentially used to produce specific feedback. Second, based on a machine learning approach, we propose a hierarchical classification of feedback that automatically predicts not only the presence or absence of a smile but also the type of smile according to an intensity scale (low or high).
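The hierarchical classification can be pictured as two chained classifiers. The sketch below illustrates the idea with random stand-in features; the paper's actual feature set and models may differ.

```python
# Hedged sketch of the two-stage classifier. Stage 1: smile vs. none.
# Stage 2, trained on smile instances only: low vs. high intensity.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 10))               # stand-in multimodal feedback features
has_smile = rng.integers(0, 2, 200)     # 0 = no smile, 1 = smile
intensity = rng.integers(0, 2, 200)     # 0 = low, 1 = high (meaningful only for smiles)

stage1 = RandomForestClassifier().fit(X, has_smile)
mask = has_smile == 1
stage2 = RandomForestClassifier().fit(X[mask], intensity[mask])

def classify(x):
    if stage1.predict([x])[0] == 0:
        return "no smile"
    return "high-intensity smile" if stage2.predict([x])[0] else "low-intensity smile"

print(classify(rng.random(10)))
```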
2020
Two-level classification for dialogue act recognition in task-oriented dialogues
Philippe Blache | Massina Abderrahmane | Stéphane Rauzy | Magalie Ochs | Houda Oufaida
Proceedings of the 28th International Conference on Computational Linguistics
Dialogue act classification becomes a complex task when dealing with fine-grained labels. Many applications, typically automatic dialogue systems, require this level of labelling. We present in this paper a two-level classification technique distinguishing between generic and specific dialogue acts (DAs). This approach makes it possible to benefit from the very good accuracy of generic DA classification at the first level and proposes an efficient approach for specific DAs based on high-level linguistic features. Our results show the benefit of including such features in the classifiers, which outperform all other feature sets, in particular those classically used in DA classification.
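As an illustration of the two-level idea (not the authors' implementation; the TF-IDF features and DA tags here are invented), a first classifier assigns the generic DA and a per-class second classifier refines it into a specific DA:

```python
# Toy two-level DA classifier: level 1 picks the generic act, level 2 is
# a separate classifier per generic class that picks the specific act.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

utterances = ["could you repeat that", "yes exactly", "what time is it", "no not really"]
generic = ["question", "answer", "question", "answer"]
specific = ["request-repeat", "confirm", "ask-info", "disconfirm"]

vec = TfidfVectorizer().fit(utterances)
X = vec.transform(utterances)

level1 = LogisticRegression().fit(X, generic)
level2 = {}  # one specific-DA classifier per generic class
for g in set(generic):
    idx = [i for i, lab in enumerate(generic) if lab == g]
    level2[g] = LogisticRegression().fit(X[idx], [specific[i] for i in idx])

def predict_da(utt):
    x = vec.transform([utt])
    g = level1.predict(x)[0]
    return g, level2[g].predict(x)[0]

print(predict_da("could you say that again"))
```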
Multimodal Corpus of Bidirectional Conversation of Human-human and Human-robot Interaction during fMRI Scanning
Birgit Rauchbauer | Youssef Hmamouche | Brigitte Bigi | Laurent Prévot | Magalie Ochs | Thierry Chaminade
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper we present an investigation of real-life, bidirectional conversations, introducing a multimodal corpus derived from natural conversations alternating between human-human and human-robot interactions. The human-robot interactions serve as a control condition for the social nature of the human-human conversations. The experimental setup consisted of conversations between a participant in a functional magnetic resonance imaging (fMRI) scanner and a human confederate or conversational robot outside the scanner room, connected via bidirectional audio and unidirectional videoconferencing (from outside to inside the scanner). A cover story provided a framework for natural, real-life conversations about images from an advertisement campaign. The resulting corpus, collected for a comprehensive characterization of bidirectional conversations, includes neural data from fMRI, physiological data (blood flow pulse and respiration), transcribed conversational data, and face- and eye-tracking recordings. It thus constitutes a unique resource for studying human conversation across neural, physiological, and behavioral data.
The Brain-IHM Dataset: a New Resource for Studying the Brain Basis of Human-Human and Human-Machine Conversations
Magalie Ochs | Roxane Bertrand | Aurélie Goujon | Deirdre Bolger | Anne-Sophie Dubarry | Philippe Blache
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper presents an original dataset of controlled interactions focusing on the study of feedback items. It consists of recordings of different conversations between a doctor and a patient, played by actors. In this corpus, the patient is mainly a listener and produces different feedback, some of it deliberately incongruent. Moreover, these conversations have been re-synthesized in a virtual reality context, in which the patient is played by an artificial agent. The final corpus is made of different movies of human-human conversations plus the same conversations replayed in a human-machine context, resulting in the first human-human/human-machine parallel corpus. The corpus is enriched with different multimodal annotations at the verbal and non-verbal levels. Moreover, and this is the first dataset of its type, we designed an experiment during which different participants watched the movies and evaluated the interaction while we recorded their brain signals. The Brain-IHM dataset is thus conceived with a triple purpose: 1) studying feedback by comparing congruent vs. incongruent feedback, 2) comparing human-human and human-machine production of feedback, and 3) studying the brain basis of feedback perception.
BrainPredict: a Tool for Predicting and Visualising Local Brain Activity
Youssef Hmamouche | Laurent Prévot | Magalie Ochs | Thierry Chaminade
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper, we present a tool allowing dynamic prediction and visualisation of an individual’s local brain activity during a conversation. The prediction module of this tool is based on classifiers trained on a corpus of human-human and human-robot conversations that includes fMRI recordings. More precisely, the module takes as input behavioral features computed from raw data, mainly the participant’s and the interlocutor’s speech, but also the participant’s visual input and eye movements. The visualisation module shows in real time the dynamics of active brain areas, synchronised with the raw behavioral data. In addition, it shows which integrated behavioral features are used to predict the activity in individual brain areas.
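The overall shape of such a prediction module can be sketched as one classifier per brain area over windowed behavioral features. Everything below (area names, feature dimensionality, model choice) is assumed for illustration, not taken from BrainPredict.

```python
# Illustrative sketch only: one binary classifier per brain area, mapping
# windowed behavioral features to a predicted activation state.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
windows = rng.random((300, 8))               # behavioral features per time window
areas = {"STS": rng.integers(0, 2, 300),     # 1 = area active in that window
         "IFG": rng.integers(0, 2, 300)}

models = {a: RandomForestClassifier().fit(windows, y) for a, y in areas.items()}
current = rng.random((1, 8))                 # features for the current window
print({a: int(m.predict(current)[0]) for a, m in models.items()})
```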
2018
De l’usage réel des emojis à une prédiction de leurs catégories (From Emoji Usage to Emoji-Category Prediction)
Gaël Guibon | Magalie Ochs | Patrice Bellot
Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN
The use of emojis in social messaging has steadily increased in recent years. Several recent studies have addressed emoji prediction in order to spare the user from browsing ever-larger emoji libraries. We propose a method for automatically recovering emoji categories from their context of use in order to improve the final prediction. To do so, we use word embeddings, treating emojis as words occurring in tweets. We then apply automatic clustering restricted to face emojis in order to check the consistency of the results with Ekman’s theory. The approach is reproducible and applicable to all types of emojis, or whenever many classes must be predicted.
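A minimal sketch of this pipeline follows: emojis are kept as tokens when training embeddings on tweets, and only the face-emoji vectors are then clustered. The toy tweets, emoji list, and hyperparameters are assumptions, not the paper's setup.

```python
# Hedged sketch: emojis as word2vec tokens, then clustering face emojis.
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

tweets = [["great", "game", "😂"], ["so", "sad", "😢"], ["love", "this", "😍"],
          ["haha", "😂", "lol"], ["miss", "you", "😢"], ["wow", "😍", "amazing"]]
model = Word2Vec(tweets, vector_size=50, min_count=1, epochs=50)

faces = ["😂", "😢", "😍"]                      # restrict clustering to face emojis
vectors = np.stack([model.wv[e] for e in faces])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(vectors)
print(dict(zip(faces, labels)))
```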
A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Magalie Ochs | Philippe Blache | Grégoire de Montcheuil | Jean-Marie Pergandi | Jorane Saubesty | Daniel Francon | Daniel Mestre
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
LIS at SemEval-2018 Task 2: Mixing Word Embeddings and Bag of Features for Multilingual Emoji Prediction
Gaël Guibon | Magalie Ochs | Patrice Bellot
Proceedings of the 12th International Workshop on Semantic Evaluation
In this paper we present the system submitted to SemEval-2018 Task 2: Multilingual Emoji Prediction. Our system treats both languages equally: it first combines word embeddings with automatically computed features of different types, and then applies the bagging algorithm Random Forest to predict the emoji of a tweet.
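A hedged sketch of that combination, with stand-in embeddings and an invented bag of surface features (the actual feature set in the submission differs):

```python
# Toy sketch: averaged word embeddings concatenated with surface features,
# fed to a Random Forest that predicts the emoji label.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
wv = {w: rng.random(50) for w in ["happy", "sad", "lol", "crying", "so"]}  # stand-in embeddings

def featurize(tweet, dim=50):
    words = tweet.split()
    vecs = [wv[w] for w in words if w in wv]
    emb = np.mean(vecs, axis=0) if vecs else np.zeros(dim)
    surface = [len(words), tweet.count("!"), tweet.count("#")]  # invented bag of features
    return np.concatenate([emb, surface])

tweets = ["happy lol !", "sad crying", "lol lol #fun", "so sad"]
labels = ["😂", "😢", "😂", "😢"]
X = np.stack([featurize(t) for t in tweets])
clf = RandomForestClassifier(n_estimators=200).fit(X, labels)
print(clf.predict([featurize("happy lol")]))
```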
2017
Une plateforme de recommandation automatique d’emojis (An emoji recommandation platform)
Gaël Guibon | Magalie Ochs | Patrice Bellot
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 3 - Démonstrations
We present a recommendation interface for sentiment-bearing emojis that uses a prediction model learned on private informal messages, each message being associated with two predicted polarity scores. The interface also allows the user’s choices to be recorded, confirming or rejecting the recommendation.
2014
Mining a multimodal corpus for non-verbal behavior sequences conveying attitudes
Mathieu Chollet | Magalie Ochs | Catherine Pelachaud
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Interpersonal attitudes are expressed by non-verbal behaviors on a variety of different modalities. The perception of these behaviors is influenced by how they are sequenced with other behaviors from the same person and behaviors from other interactants. In this paper, we present a method for extracting and generating sequences of non-verbal signals expressing interpersonal attitudes. These sequences are used as part of a framework for non-verbal expression with Embodied Conversational Agents that considers different features of non-verbal behavior: global behavior tendencies, interpersonal reactions, sequencing of non-verbal signals, and communicative intentions. Our method uses a sequence mining technique on an annotated multimodal corpus to extract sequences characteristic of different attitudes. New sequences of non-verbal signals are generated using a probabilistic model, and evaluated using the previously mined sequences.
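As a simplified illustration of the mining step (real sequence mining such as GSP allows gaps between items; this toy version counts only contiguous subsequences, and the signal labels and data are invented):

```python
# Toy sketch: count contiguous non-verbal signal subsequences per attitude
# and keep the frequent ones as characteristic patterns.
from collections import Counter

sequences = {  # attitude -> annotated signal sequences (invented)
    "friendly": [["smile", "nod", "gaze_at"], ["nod", "smile", "lean_in"]],
    "dominant": [["gaze_at", "frown", "lean_in"], ["frown", "gaze_away"]],
}

def frequent_subsequences(seqs, n=2, min_count=2):
    counts = Counter(tuple(s[i:i + n]) for s in seqs for i in range(len(s) - n + 1))
    return {sub: c for sub, c in counts.items() if c >= min_count}

for attitude, seqs in sequences.items():
    print(attitude, frequent_subsequences(seqs, min_count=1))
```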
A model to generate adaptive multimodal job interviews with a virtual recruiter
Zoraida Callejas | Brian Ravenet | Magalie Ochs | Catherine Pelachaud
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper presents an adaptive model of multimodal social behavior for embodied conversational agents. The context of this research is the training of youngsters for job interviews in a serious game where the agent plays the role of a virtual recruiter. With the proposed model, the agent is able to adapt its social behavior according to the anxiety level of the trainee and a predefined difficulty level of the game. This information is used to select the objective of the system (to challenge or comfort the user), which is achieved by selecting the complexity of the next question posed and the agent’s verbal and non-verbal behavior. We have carried out a perception study showing that the multimodal behavior of an agent implementing our model successfully conveys the expected social attitudes.