SEF@UHH at SemEval-2017 Task 1: Unsupervised Knowledge-Free Semantic Textual Similarity via Paragraph Vector

Mirela-Stefania Duma, Wolfgang Menzel


Abstract
This paper describes our unsupervised knowledge-free approach to the SemEval-2017 Task 1 Competition. The proposed method makes use of Paragraph Vector for assessing the semantic similarity between pairs of sentences. We experimented with various dimensions of the vector and three state-of-the-art similarity metrics. Given a cross-lingual task, we trained models corresponding to its two languages and combined the models by averaging the similarity scores. The results of our submitted runs are above the median scores for five out of seven test sets by means of Pearson Correlation. Moreover, one of our system runs performed best on the Spanish-English-WMT test set ranking first out of 53 runs submitted in total by all participants.
Anthology ID:
S17-2024
Volume:
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel Cer, David Jurgens
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
170–174
Language:
URL:
https://aclanthology.org/S17-2024
DOI:
10.18653/v1/S17-2024
Bibkey:
Cite (ACL):
Mirela-Stefania Duma and Wolfgang Menzel. 2017. SEF@UHH at SemEval-2017 Task 1: Unsupervised Knowledge-Free Semantic Textual Similarity via Paragraph Vector. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 170–174, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
SEF@UHH at SemEval-2017 Task 1: Unsupervised Knowledge-Free Semantic Textual Similarity via Paragraph Vector (Duma & Menzel, SemEval 2017)
Copy Citation:
PDF:
https://aclanthology.org/S17-2024.pdf
Data
SNLI