Hohyun Hwang


2021

pdf bib
Semi-Supervised Learning Based on Auto-generated Lexicon Using XAI in Sentiment Analysis
Hohyun Hwang | Younghoon Lee
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

In this study, we proposed a novel Lexicon-based pseudo-labeling method utilizing explainable AI(XAI) approach. Existing approach have a fundamental limitation in their robustness because poor classifier leads to inaccurate soft-labeling, and it lead to poor classifier repetitively. Meanwhile, we generate the lexicon consists of sentiment word based on the explainability score. Then we calculate the confidence of unlabeled data with lexicon and add them into labeled dataset for the robust pseudo-labeling approach. Our proposed method has three contributions. First, the proposed methodology automatically generates a lexicon based on XAI and performs independent pseudo-labeling, thereby guaranteeing higher performance and robustness compared to the existing one. Second, since lexicon-based pseudo-labeling is performed without re-learning in most of models, time efficiency is considerably increased, and third, the generated high-quality lexicon can be available for sentiment analysis of data from similar domains. The effectiveness and efficiency of our proposed method were verified through quantitative comparison with the existing pseudo-labeling method and qualitative review of the generated lexicon.