Evaluating Evaluation Measures for Ordinal Classification and Ordinal Quantification

Tetsuya Sakai


Abstract
Ordinal Classification (OC) is an important classification task where the classes are ordinal. For example, an OC task for sentiment analysis could have the following classes: highly positive, positive, neutral, negative, highly negative. Clearly, evaluation measures for an OC task should penalise misclassifications by considering the ordinal nature of the classes. Ordinal Quantification (OQ) is a related task where the gold data is a distribution over ordinal classes, and the system is required to estimate this distribution. Evaluation measures for an OQ task should also take the ordinal nature of the classes into account. However, for both OC and OQ, there are only a small number of known evaluation measures that meet this basic requirement. In the present study, we utilise data from the SemEval and NTCIR communities to clarify the properties of nine evaluation measures in the context of OC tasks, and six measures in the context of OQ tasks.
Anthology ID:
2021.acl-long.214
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
2759–2769
Language:
URL:
https://aclanthology.org/2021.acl-long.214
DOI:
10.18653/v1/2021.acl-long.214
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2021.acl-long.214.pdf