UINSUSKA-TiTech at SemEval-2017 Task 3: Exploiting Word Importance Levels for Similarity Features for CQA

Surya Agustian, Hiroya Takamura


Abstract
Most core techniques for solving problems in the Community Question Answering (CQA) task rely on similarity computation. This work focuses on the similarity between two sentences (or questions, in subtask B) based on word embeddings. We exploit the importance levels of words in sentences or questions to build similarity features for classification and ranking with machine learning. Using only two types of similarity metric, our proposed method achieves results comparable to those of more complex systems. On the subtask B 2017 dataset, this method ranks 7th out of 13 participants. On the 2016 dataset, it ranks 8th out of 12 and outperforms some complex systems. Further, this finding is open to further exploration and has the potential to serve as a baseline that can be extended to many tasks in CQA and to other systems based on textual similarity.
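The abstract does not say how word importance levels are computed or which two similarity metrics are used, so the following is only an illustrative sketch and not the authors' method. It assumes each word carries a hypothetical importance weight (for example an IDF-like score), represents a question as the importance-weighted average of its word vectors, and compares two questions by cosine similarity. The toy embeddings, the weights, and all function names are invented for illustration.

```python
import math


def weighted_sentence_vector(tokens, embeddings, importance, default_weight=1.0):
    """Return the importance-weighted average of the known word vectors."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:
            continue  # skip out-of-vocabulary words
        w = importance.get(tok, default_weight)
        for i, x in enumerate(vec):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no usable words: return the zero vector
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_similarity(tokens_a, tokens_b, embeddings, importance):
    """Similarity feature for a pair of tokenized questions."""
    vec_a = weighted_sentence_vector(tokens_a, embeddings, importance)
    vec_b = weighted_sentence_vector(tokens_b, embeddings, importance)
    return cosine(vec_a, vec_b)


# Toy 2-dimensional embeddings and importance weights (assumed values).
EMBEDDINGS = {
    "visa": [1.0, 0.0],
    "renew": [0.8, 0.6],
    "how": [0.0, 1.0],
    "the": [0.0, 1.0],
}
IMPORTANCE = {"visa": 3.0, "renew": 2.0, "how": 0.1, "the": 0.1}

score = weighted_similarity(
    ["how", "renew", "visa"], ["the", "visa"], EMBEDDINGS, IMPORTANCE
)
```

A score such as this could be one feature passed to a classifier or ranker. In practice the embeddings would come from a pretrained model, and the weights from whatever importance scheme is chosen.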
Anthology ID:
S17-2061
Volume:
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel Cer, David Jurgens
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
370–374
URL:
https://aclanthology.org/S17-2061
DOI:
10.18653/v1/S17-2061
Cite (ACL):
Surya Agustian and Hiroya Takamura. 2017. UINSUSKA-TiTech at SemEval-2017 Task 3: Exploiting Word Importance Levels for Similarity Features for CQA. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 370–374, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
UINSUSKA-TiTech at SemEval-2017 Task 3: Exploiting Word Importance Levels for Similarity Features for CQA (Agustian & Takamura, SemEval 2017)
PDF:
https://aclanthology.org/S17-2061.pdf