Lin Deng
2024
STSPL-SSC: Semi-Supervised Few-Shot Short Text Clustering with Semantic Text Similarity Optimized Pseudo-Labels
Wenhua Nie | Lin Deng | Chang-Bo Liu | JialingWei | Ruitong Han | Haoran Zheng
Findings of the Association for Computational Linguistics: ACL 2024
This study introduces the Semantic Textual Similarity Pseudo-Label Semi-Supervised Clustering (STSPL-SSC) framework. STSPL-SSC is designed to tackle the prevalent issue of scarce labeled data by combining a Semantic Textual Similarity Pseudo-Label Generation process with a Robust Contrastive Learning module. The process begins by applying k-means clustering to embeddings for initial pseudo-label allocation. A Semantic Text Similarity-enhanced module then supervises a secondary clustering of the pseudo-labels using the labeled data so that they better align with the real cluster centers. Subsequently, an Adaptive Optimal Transport (AOT) approach fine-tunes the pseudo-labels. Finally, a Robust Contrastive Learning module fosters the learning of class-level and instance-level distinctions, helping the clusters separate more cleanly. Experiments on multiple real-world datasets demonstrate that, with just one label per class, clustering performance improves significantly, outperforming state-of-the-art models by 1-6% in both accuracy and normalized mutual information and approaching the results of fully-labeled classification.
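To make the pipeline concrete, below is a minimal sketch (not the authors' code) of two of the steps named in the abstract: k-means on sentence embeddings for initial pseudo-label allocation, followed by an entropic optimal-transport (Sinkhorn-style) refinement of the assignments. The encoder checkpoint, the Sinkhorn formulation, and the uniform cluster-size prior are illustrative assumptions; the paper's Adaptive Optimal Transport details are not reproduced here.

```python
# Sketch under stated assumptions: k-means pseudo-labels + Sinkhorn refinement.
import numpy as np
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

def initial_pseudo_labels(texts, n_clusters, model_name="all-MiniLM-L6-v2"):
    """Embed short texts and assign initial pseudo-labels with k-means."""
    encoder = SentenceTransformer(model_name)            # hypothetical encoder choice
    embeddings = encoder.encode(texts, normalize_embeddings=True)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    return embeddings, km.labels_, km.cluster_centers_

def sinkhorn_refine(embeddings, centers, n_iters=50, eps=0.05):
    """Refine assignments with entropic optimal transport, pushing the
    transport plan toward an (assumed) uniform cluster-size prior."""
    sims = embeddings @ centers.T                        # sample-to-center similarity
    K = np.exp(sims / eps)                               # Gibbs kernel
    n, k = K.shape
    r = np.full(n, 1.0 / n)                              # uniform sample marginal
    c = np.full(k, 1.0 / k)                              # assumed uniform cluster prior
    u = np.ones(n)
    for _ in range(n_iters):                             # Sinkhorn iterations
        v = c / (K.T @ u)
        u = r / (K @ v)
    P = u[:, None] * K * v[None, :]                      # transport plan
    return P.argmax(axis=1)                              # refined pseudo-labels
```

A call such as `emb, labels, centers = initial_pseudo_labels(texts, n_clusters=4)` followed by `refined = sinkhorn_refine(emb, centers)` would produce initial and refined pseudo-labels on a toy dataset; the supervised secondary clustering and the contrastive-learning module are not covered by this sketch.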
UIR-ISC at SemEval-2024 Task 3: Textual Emotion-Cause Pair Extraction in Conversations
Hongyu Guo | Xueyao Zhang | Yiyang Chen | Lin Deng | Binyang Li
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
The goal of Emotion-Cause Pair Extraction (ECPE) is to identify the causes of emotion changes, i.e., what triggers a particular emotion. This paper proposes a three-step learning approach, named ECSP, for the Textual Emotion-Cause Pair Extraction in Conversations task of SemEval-2024 Task 3. First, we perform data preprocessing on the original dataset to construct negative samples. Second, we use a pre-trained model to build token-sequence representations with contextual information and obtain emotion predictions. Third, we cast textual emotion-cause pair extraction as a machine reading comprehension task and fine-tune two pre-trained models, RoBERTa and SpanBERT. Our system performed well in the official rankings, placing 3rd under strict match with a strict F1-score of 15.18%, which further demonstrates its robust performance.
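As a rough illustration of the third step, the sketch below (not the team's actual system) treats cause extraction as extractive question answering with a RoBERTa span-prediction head from the Hugging Face transformers library. The checkpoint name and the question template are assumptions made for illustration.

```python
# Sketch under stated assumptions: cause extraction framed as machine reading comprehension.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

CHECKPOINT = "deepset/roberta-base-squad2"               # hypothetical QA checkpoint
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForQuestionAnswering.from_pretrained(CHECKPOINT)

def extract_cause_span(emotion_utterance, conversation_context):
    """Return the context span predicted to cause the emotion in the utterance."""
    question = f'What caused the emotion expressed in: "{emotion_utterance}"?'
    inputs = tokenizer(question, conversation_context,
                       return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    start = int(outputs.start_logits.argmax())           # most likely span start
    end = int(outputs.end_logits.argmax())                # most likely span end
    span_ids = inputs["input_ids"][0][start:end + 1]
    return tokenizer.decode(span_ids, skip_special_tokens=True)
```

In the described system, such span predictions would then be paired with the emotion predictions from step two to form emotion-cause pairs; negative-sample construction and the SpanBERT variant are outside this sketch.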