Training a Broad-Coverage German Sentiment Classification Model for Dialog Systems

Oliver Guhr, Anne-Kathrin Schumann, Frank Bahrmann, Hans Joachim Böhme


Abstract
This paper describes the training of a general-purpose German sentiment classification model. Sentiment classification is an important aspect of general text analytics. Furthermore, it plays a vital role in dialogue systems and voice interfaces, which depend on the system's ability to pick up and understand emotional signals from user utterances. The study outlines how we collected a new German sentiment corpus and combined it with existing resources to train a broad-coverage German sentiment model. The resulting data set contains 5.4 million labelled samples. We used the data to train both a simple convolutional and a transformer-based classification model, and compared the results achieved with various training configurations. The model and the data set will be published along with this paper.
Anthology ID:
2020.lrec-1.202
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1627–1632
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.202
Cite (ACL):
Oliver Guhr, Anne-Kathrin Schumann, Frank Bahrmann, and Hans Joachim Böhme. 2020. Training a Broad-Coverage German Sentiment Classification Model for Dialog Systems. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1627–1632, Marseille, France. European Language Resources Association.
Cite (Informal):
Training a Broad-Coverage German Sentiment Classification Model for Dialog Systems (Guhr et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.202.pdf