Building Large-Scale English and Korean Datasets for Aspect-Level Sentiment Analysis in Automotive Domain

Dongmin Hyun, Junsu Cho, Hwanjo Yu


Abstract
We release large-scale datasets of user comments in two languages, English and Korean, for aspect-level sentiment analysis in the automotive domain. The datasets consist of 58,000+ comment-aspect pairs, making them the largest among existing datasets. In addition, this work covers a new language (i.e., Korean) along with English for aspect-level sentiment analysis. We build the datasets from the automotive domain to enable users (e.g., marketers in automotive companies) to analyze the voice of customers on automobiles. We also provide baseline performance results for future work by evaluating recent models on the released datasets.
Anthology ID:
2020.coling-main.83
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
961–966
URL:
https://aclanthology.org/2020.coling-main.83
DOI:
10.18653/v1/2020.coling-main.83
Cite (ACL):
Dongmin Hyun, Junsu Cho, and Hwanjo Yu. 2020. Building Large-Scale English and Korean Datasets for Aspect-Level Sentiment Analysis in Automotive Domain. In Proceedings of the 28th International Conference on Computational Linguistics, pages 961–966, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Building Large-Scale English and Korean Datasets for Aspect-Level Sentiment Analysis in Automotive Domain (Hyun et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.83.pdf
Code:
dmhyun/alsadata