Improving Paraphrase Detection with the Adversarial Paraphrasing Task

Animesh Nighojkar; John Licato

doi:10.18653/v1/2021.acl-long.552

Abstract

If two sentences have the same meaning, it should follow that they are equivalent in their inferential properties, i.e., each sentence should textually entail the other. However, many paraphrase datasets currently in widespread use rely on a sense of paraphrase based on word overlap and syntax. Can paraphrase detection models instead be taught to identify paraphrases in a way that draws on the inferential properties of the sentences and does not over-rely on the lexical and syntactic similarities of a sentence pair? We apply the adversarial paradigm to this question and introduce a new adversarial method of dataset creation for paraphrase identification: the Adversarial Paraphrasing Task (APT), which asks participants to generate paraphrases that are semantically equivalent (in the sense of being mutually implicative) but lexically and syntactically disparate. These sentence pairs can then be used both to test paraphrase identification models (which barely reach chance-level accuracy) and to improve their performance. To accelerate dataset generation, we explore automating APT with T5 and show that the resulting dataset also improves accuracy. We discuss implications for paraphrase detection and release our dataset in the hope of making paraphrase detection models better able to detect sentence-level meaning equivalence.
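
The criterion described in the abstract (mutual implication combined with lexical dissimilarity) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the token-level Jaccard overlap stands in for whatever similarity measure the authors use, the 0.5 threshold is arbitrary, and `toy_entails` is a hand-labeled lookup table standing in for a real textual entailment model.

```python
import re
from typing import Callable

# Hypothetical hand-labeled (premise, hypothesis) pairs where the premise
# entails the hypothesis. A real system would query an entailment model.
S1 = "The committee rejected the proposal."
S2 = "The proposal was turned down by the committee."
S3 = "A dog is sleeping on the porch."
S4 = "An animal is on the porch."
TOY_ENTAILMENTS = {(S1, S2), (S2, S1), (S3, S4)}


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Stand-in entailment check: identical sentences or a labeled pair."""
    return premise == hypothesis or (premise, hypothesis) in TOY_ENTAILMENTS


def jaccard_overlap(a: str, b: str) -> float:
    """Fraction of distinct lowercased word tokens shared by both sentences."""
    tokens_a = set(re.findall(r"[a-z0-9']+", a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9']+", b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_adversarial_paraphrase(
    s1: str,
    s2: str,
    entails: Callable[[str, str], bool],
    max_overlap: float = 0.5,  # arbitrary illustrative threshold
) -> bool:
    """True if the pair entails in both directions and shares few tokens."""
    mutually_implicative = entails(s1, s2) and entails(s2, s1)
    lexically_disparate = jaccard_overlap(s1, s2) <= max_overlap
    return mutually_implicative and lexically_disparate


if __name__ == "__main__":
    print(is_adversarial_paraphrase(S1, S2, toy_entails))  # both conditions hold
    print(is_adversarial_paraphrase(S3, S4, toy_entails))  # one-way entailment only
    print(is_adversarial_paraphrase(S1, S1, toy_entails))  # identical wording
```

In this sketch, a pair that entails in only one direction is rejected, as is a pair whose wording is too similar, which mirrors the two requirements the abstract places on APT paraphrases.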

Anthology ID: 2021.acl-long.552
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month: August
Year: 2021
Address: Online
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 7106–7116
URL: https://aclanthology.org/2021.acl-long.552/
DOI: 10.18653/v1/2021.acl-long.552
Cite (ACL): Animesh Nighojkar and John Licato. 2021. Improving Paraphrase Detection with the Adversarial Paraphrasing Task. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 7106–7116, Online. Association for Computational Linguistics.
Cite (Informal): Improving Paraphrase Detection with the Adversarial Paraphrasing Task (Nighojkar & Licato, ACL-IJCNLP 2021)
PDF: https://aclanthology.org/2021.acl-long.552.pdf
Video: https://aclanthology.org/2021.acl-long.552.mp4
