DUTH at SemEval-2019 Task 8: Part-Of-Speech Features for Question Classification

Anastasios Bairaktaris; Symeon Symeonidis; Avi Arampatzis

doi:10.18653/v1/S19-2202

Abstract

This report describes the methods used by the Democritus University of Thrace (DUTH) team in SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums. Our team addressed only Subtask A: Question Classification. Our approach was based on shallow natural language processing (NLP) pre-processing techniques to reduce noise in the data, feature selection methods, and supervised machine learning algorithms such as NearestCentroid, Perceptron, and LinearSVC. Exploratory data analysis and visualizations helped us determine the essential features. To improve classification accuracy, we developed a customized list of stopwords, retaining some common function words that denote opinion or fact and that standard stoplisting would have removed. Furthermore, we examined the usefulness of part-of-speech (POS) categories for the task; by experimenting with removing nouns and adjectives, we found some evidence that verbs are a valuable POS category for the opinion question class.
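The two pre-processing ideas in the abstract, a customized stopword list and the removal of selected POS categories, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the word lists, the retained function words, the toy POS lexicon, and the function names are hypothetical and are not taken from the paper. A real system would use a proper POS tagger and pass the filtered tokens to a classifier such as those named above.

```python
import re

# Hypothetical stand-in for a standard stopword list (not the paper's list).
STANDARD_STOPWORDS = {
    "the", "a", "an", "is", "in", "to", "of", "i",
    "what", "where", "should", "would",
}

# Hypothetical function words kept because they may signal opinion or fact.
RETAINED = {"what", "where", "should", "would"}

# Customized stoplist: the standard list minus the retained words.
CUSTOM_STOPWORDS = STANDARD_STOPWORDS - RETAINED

# Toy POS lexicon standing in for a real tagger (illustrative only).
TOY_POS = {
    "visa": "NOUN", "office": "NOUN", "car": "NOUN",
    "good": "ADJ", "cheap": "ADJ",
    "buy": "VERB", "recommend": "VERB", "open": "VERB",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stopwords(tokens, stopwords=CUSTOM_STOPWORDS):
    """Drop tokens that appear in the (customized) stopword list."""
    return [t for t in tokens if t not in stopwords]


def toy_tag(tokens):
    """Attach a POS tag from the toy lexicon; unknown words get 'OTHER'."""
    return [(t, TOY_POS.get(t, "OTHER")) for t in tokens]


def drop_pos(tagged, dropped=frozenset({"NOUN", "ADJ"})):
    """Remove words whose POS tag is in the dropped categories."""
    return [word for word, tag in tagged if tag not in dropped]


def preprocess(text):
    """Tokenize, apply the custom stoplist, then remove nouns and adjectives."""
    tokens = remove_stopwords(tokenize(text))
    return drop_pos(toy_tag(tokens))


if __name__ == "__main__":
    print(preprocess("Should I buy a good car?"))
    print(preprocess("Where is the visa office?"))
```

In this sketch, the retained function words and the verbs survive the filtering, while nouns and adjectives are removed, which mirrors the kind of feature set the abstract describes testing.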

Anthology ID: S19-2202
Volume: Proceedings of the 13th International Workshop on Semantic Evaluation
Month: June
Year: 2019
Address: Minneapolis, Minnesota, USA
Editors: Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1155–1159
URL: https://aclanthology.org/S19-2202/
DOI: 10.18653/v1/S19-2202
Cite (ACL): Anastasios Bairaktaris, Symeon Symeonidis, and Avi Arampatzis. 2019. DUTH at SemEval-2019 Task 8: Part-Of-Speech Features for Question Classification. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 1155–1159, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal): DUTH at SemEval-2019 Task 8: Part-Of-Speech Features for Question Classification (Bairaktaris et al., SemEval 2019)
PDF: https://aclanthology.org/S19-2202.pdf
