Semi-Automatic Construction and Refinement of an Annotated Corpus for a Deep Learning Framework for Emotion Classification

Jiajun Xu, Kyosuke Masuda, Hiromitsu Nishizaki, Fumiyo Fukumoto, Yoshimi Suzuki


Abstract
A significant difficulty in using a deep learning (machine learning) framework for emotion classification is the need for a large emotion corpus in which each sentence is assigned emotion labels. Constructing such a corpus is costly in both time and money. This paper therefore proposes a method for constructing an emotion corpus semi-automatically. Sentences were mined from Twitter using emotional seed words selected from a dictionary in which the emotion words are well defined. Each tweet was retrieved by a single emotional seed word, and the retrieved sentences were assigned emotion labels based on the emotion category of that seed word. The deep learning-based emotion classification model could not achieve high accuracy, because the semi-automatically constructed corpus contained many errors in its emotion labels. This paper therefore also proposes and tests an approach that improves the quality of the emotion labels by automatically correcting labeling errors. The experimental results showed that the proposed method worked well: classification accuracy improved from 44.9% to 55.1% on the Twitter emotion classification task.
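The seed-word labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the seed lexicon, the emotion categories, the example tweets, and the function names `label_by_seed` and `build_corpus` are all hypothetical, and simple word matching over an in-memory list stands in for retrieval from Twitter.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical seed lexicon (assumption): seed word -> emotion category.
# The paper selects its seed words from an emotion dictionary; these entries are invented.
SEED_LEXICON: Dict[str, str] = {
    "delighted": "joy",
    "furious": "anger",
    "terrified": "fear",
}


def label_by_seed(tweets: List[str], seed_word: str, emotion: str) -> List[Tuple[str, str]]:
    """Keep tweets containing the seed word and label them with its emotion category."""
    pattern = re.compile(rf"\b{re.escape(seed_word)}\b", re.IGNORECASE)
    return [(text, emotion) for text in tweets if pattern.search(text)]


def build_corpus(tweets: List[str], lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Run one retrieval per seed word and collect the (tweet, label) pairs."""
    corpus: List[Tuple[str, str]] = []
    for seed_word, emotion in lexicon.items():
        corpus.extend(label_by_seed(tweets, seed_word, emotion))
    return corpus


# Invented example tweets (assumption), used only to exercise the sketch.
example_tweets = [
    "I am delighted with the result",
    "So FURIOUS right now!",
    "I am not furious at all",
    "Nothing to see here",
]

corpus = build_corpus(example_tweets, SEED_LEXICON)
for text, label in corpus:
    print(label, "|", text)
```

In this sketch the negated tweet "I am not furious at all" still receives the label "anger", which is one plausible way labeling errors of the kind the abstract mentions could enter a corpus built this way.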
Anthology ID:
2020.lrec-1.200
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1611–1617
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.200
Cite (ACL):
Jiajun Xu, Kyosuke Masuda, Hiromitsu Nishizaki, Fumiyo Fukumoto, and Yoshimi Suzuki. 2020. Semi-Automatic Construction and Refinement of an Annotated Corpus for a Deep Learning Framework for Emotion Classification. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1611–1617, Marseille, France. European Language Resources Association.
Cite (Informal):
Semi-Automatic Construction and Refinement of an Annotated Corpus for a Deep Learning Framework for Emotion Classification (Xu et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.200.pdf