Monolingual Social Media Datasets for Detecting Contradiction and Entailment

Piroska Lendvai, Isabelle Augenstein, Kalina Bontcheva, Thierry Declerck


Abstract
Entailment recognition approaches are useful for application domains such as information extraction, question answering or summarisation, for which evidence from multiple sentences needs to be combined. We report on a new 3-way judgement Recognizing Textual Entailment (RTE) resource that originates in the Social Media domain, and explain our semi-automatic creation method for the special purpose of information verification, which draws on manually established rumourous claims reported during crisis events. From about 500 English tweets related to 70 unique claims we compile and evaluate 5.4k RTE pairs, while continuing to automate the workflow to generate similar-sized datasets in other languages.
Anthology ID:
L16-1729
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
4602–4605
URL:
https://aclanthology.org/L16-1729
Cite (ACL):
Piroska Lendvai, Isabelle Augenstein, Kalina Bontcheva, and Thierry Declerck. 2016. Monolingual Social Media Datasets for Detecting Contradiction and Entailment. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 4602–4605, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Monolingual Social Media Datasets for Detecting Contradiction and Entailment (Lendvai et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1729.pdf