Semantic Clustering of Pivot Paraphrases

Marianna Apidianaki, Emilia Verzeni, Diana McCarthy


Abstract
Paraphrases extracted from parallel corpora by the pivot method (Bannard and Callison-Burch, 2005) constitute a valuable resource for multilingual NLP applications. In this study, we analyse the semantics of unigram pivot paraphrases and use a graph-based sense induction approach to unveil hidden sense distinctions in the paraphrase sets. The comparison of the acquired senses to gold data from the Lexical Substitution shared task (McCarthy and Navigli, 2007) demonstrates that sense distinctions exist in the paraphrase sets and highlights the need for a disambiguation step in applications using this resource.
Anthology ID:
L14-1401
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
4270–4275
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/475_Paper.pdf
Cite (ACL):
Marianna Apidianaki, Emilia Verzeni, and Diana McCarthy. 2014. Semantic Clustering of Pivot Paraphrases. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 4270–4275, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Semantic Clustering of Pivot Paraphrases (Apidianaki et al., LREC 2014)