Cam-Van Thi Nguyen

2024

Curriculum Learning Meets Directed Acyclic Graph for Multimodal Emotion Recognition
Cam-Van Thi Nguyen | Cao-Bach Nguyen | Duc-Trong Le | Quang-Thuy Ha
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Emotion recognition in conversation (ERC) is a crucial task in natural language processing and affective computing. This paper proposes MultiDAG+CL, a novel approach for multimodal ERC that employs a Directed Acyclic Graph (DAG) to integrate textual, acoustic, and visual features within a unified framework. The model is enhanced by Curriculum Learning (CL) to address challenges related to emotional shifts and data imbalance. Curriculum learning facilitates the learning process by gradually presenting training samples in a meaningful order, thereby improving the model’s performance in handling emotional variations and data imbalance. Experimental results on the IEMOCAP and MELD datasets demonstrate that the MultiDAG+CL models outperform baseline models. We release the code and experiments: https://github.com/vanntc711/MultiDAG-CL.

A Dual-Module Denoising Approach with Curriculum Learning for Enhancing Multimodal Aspect-Based Sentiment Analysis
Nguyen Van Doan | Dat Tran Nguyen | Cam-Van Thi Nguyen
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

TECO: Improving Multimodal Intent Recognition with Text Enhancement through Commonsense Knowledge Extraction
Quynh-Mai Thi Nguyen | Lan-Nhi Thi Nguyen | Cam-Van Thi Nguyen
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

2023

Self-MI: Efficient Multimodal Fusion via Self-Supervised Multi-Task Learning with Auxiliary Mutual Information Maximization
Cam-Van Thi Nguyen | Ngoc-Hoa Thi Nguyen | Duc-Trong Le | Quang-Thuy Ha
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

Conversation Understanding using Relational Temporal Graph Neural Networks with Auxiliary Cross-Modality Interaction
Cam-Van Thi Nguyen | Anh-Tuan Mai | The-Son Le | Hai-Dang Kieu | Duc-Trong Le
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Emotion recognition is a crucial task for human conversation understanding. It becomes more challenging with multimodal data, e.g., language, voice, and facial expressions. As a typical solution, global and local context information is exploited to predict the emotional label for every single sentence, i.e., utterance, in the dialogue. Specifically, the global representation can be captured by modeling cross-modal interactions at the conversation level. The local one is often inferred using the temporal information of speakers or emotional shifts, which neglects vital factors at the utterance level. Additionally, most existing approaches take fused features of multiple modalities as a unified input without leveraging modality-specific representations. Motivated by these problems, we propose the Relational Temporal Graph Neural Network with Auxiliary Cross-Modality Interaction (CORECT), a novel neural network framework that effectively captures conversation-level cross-modality interactions and utterance-level temporal dependencies in a modality-specific manner for conversation understanding. Extensive experiments demonstrate the effectiveness of CORECT via its state-of-the-art results on the IEMOCAP and CMU-MOSEI datasets for the multimodal ERC task.
