Recognizing Textual Entailment in Twitter Using Word Embeddings

Octavia-Maria Şulea


Abstract
In this paper, we investigate the application of machine learning techniques and word embeddings to the task of Recognizing Textual Entailment (RTE) in social media. We look at a manually labeled dataset of user-generated short texts posted on Twitter (tweets) related to four recent media events (the Charlie Hebdo shooting, the Ottawa shooting, the Sydney siege, and the Germanwings crash) and test to what extent neural techniques and embeddings can distinguish between tweets that entail each other, contradict each other, or make unrelated claims. We obtain results comparable to the state of the art in a train-test setting, but we show that, because the data is noisy, results plummet under an evaluation strategy designed to better simulate a real-life train-test scenario.
Anthology ID:
W17-5306
Volume:
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Samuel Bowman, Yoav Goldberg, Felix Hill, Angeliki Lazaridou, Omer Levy, Roi Reichart, Anders Søgaard
Venue:
RepEval
Publisher:
Association for Computational Linguistics
Pages:
31–35
URL:
https://aclanthology.org/W17-5306/
DOI:
10.18653/v1/W17-5306
Cite (ACL):
Octavia-Maria Şulea. 2017. Recognizing Textual Entailment in Twitter Using Word Embeddings. In Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP, pages 31–35, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Recognizing Textual Entailment in Twitter Using Word Embeddings (Şulea, RepEval 2017)
PDF:
https://aclanthology.org/W17-5306.pdf