A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain

Haruya Suzuki, Yuto Miyauchi, Kazuki Akiyama, Tomoyuki Kajiwara, Takashi Ninomiya, Noriko Takemura, Yuta Nakashima, Hajime Nagahara


Abstract
We annotate 35,000 SNS posts with both the writer’s subjective sentiment polarity labels and the reader’s objective ones to construct a Japanese sentiment analysis dataset. Our dataset includes intensity labels (none, weak, medium, and strong) for each of Plutchik’s eight basic emotions (joy, sadness, anticipation, surprise, anger, fear, disgust, and trust) as well as sentiment polarity labels (strong positive, positive, neutral, negative, and strong negative). Previous studies on emotion analysis have treated basic emotions and sentiment polarity independently, so few corpora are annotated with both. Our dataset is the first large-scale corpus annotated with both kinds of labels, from both the writer’s and the reader’s perspectives. In this paper, we analyze the relationship between basic emotion intensity and sentiment polarity on our dataset and report benchmark results for sentiment polarity classification.
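The label scheme described in the abstract can be pictured as a small data structure. The following is a minimal sketch only: the class names, field names, and the `make_annotation` helper are hypothetical, the integer encodings (0 to 3 for intensity, -2 to 2 for polarity) are assumptions made for illustration, and a single reader annotation per post is assumed for simplicity. It is not the file format of the released dataset.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict

# The eight basic emotions by Plutchik, as listed in the abstract.
EMOTIONS = ("joy", "sadness", "anticipation", "surprise",
            "anger", "fear", "disgust", "trust")


class Intensity(IntEnum):
    # Integer values are an assumption; the abstract only names the levels.
    NONE = 0
    WEAK = 1
    MEDIUM = 2
    STRONG = 3


class Polarity(IntEnum):
    # Integer values are an assumption; the abstract only names the classes.
    STRONG_NEGATIVE = -2
    NEGATIVE = -1
    NEUTRAL = 0
    POSITIVE = 1
    STRONG_POSITIVE = 2


@dataclass
class Annotation:
    """One annotator perspective: eight emotion intensities plus a polarity."""
    emotions: Dict[str, Intensity]
    polarity: Polarity

    def __post_init__(self) -> None:
        # Require exactly the eight emotions, no more and no fewer.
        if set(self.emotions) != set(EMOTIONS):
            raise ValueError("emotions must cover exactly the eight basic emotions")


@dataclass
class AnnotatedPost:
    """A post labeled from both the writer's and the reader's perspective."""
    text: str
    writer: Annotation  # subjective labels
    reader: Annotation  # objective labels


def make_annotation(polarity: Polarity, **intensities: Intensity) -> Annotation:
    """Hypothetical helper: unspecified emotions default to Intensity.NONE."""
    emotions = {name: Intensity.NONE for name in EMOTIONS}
    for name, level in intensities.items():
        if name not in emotions:
            raise ValueError(f"unknown emotion: {name}")
        emotions[name] = Intensity(level)
    return Annotation(emotions=emotions, polarity=polarity)


# Invented example for illustration; not taken from the dataset.
post = AnnotatedPost(
    text="(placeholder post text)",
    writer=make_annotation(Polarity.STRONG_POSITIVE, joy=Intensity.STRONG),
    reader=make_annotation(Polarity.POSITIVE, joy=Intensity.MEDIUM),
)
```

Keeping the writer and reader annotations as separate objects of the same type makes it straightforward to compare the two perspectives on a post, for example by checking whether `post.writer.polarity` and `post.reader.polarity` agree.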
Anthology ID:
2022.lrec-1.759
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
7022–7028
URL:
https://aclanthology.org/2022.lrec-1.759
Cite (ACL):
Haruya Suzuki, Yuto Miyauchi, Kazuki Akiyama, Tomoyuki Kajiwara, Takashi Ninomiya, Noriko Takemura, Yuta Nakashima, and Hajime Nagahara. 2022. A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 7022–7028, Marseille, France. European Language Resources Association.
Cite (Informal):
A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain (Suzuki et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.759.pdf
Code
 ids-cv/wrime
Data
IMDb Movie Reviews, SST