Determining Semantic Textual Similarity using Natural Deduction Proofs

Hitomi Yanaka, Koji Mineshima, Pascual Martínez-Gómez, Daisuke Bekki


Abstract
Determining semantic textual similarity is a core research subject in natural language processing. Since vector-based models for sentence representation often use shallow information, capturing accurate semantics is difficult. By contrast, logical semantic representations capture deeper levels of sentence semantics, but their symbolic nature does not offer graded notions of textual similarity. We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs. For the natural deduction proofs, we use ccg2lambda, a higher-order automatic inference system, which converts Combinatory Categorial Grammar (CCG) derivation trees into semantic representations and conducts natural deduction proofs. Experiments show that our system was able to outperform other logic-based systems and that features derived from the proofs are effective for learning textual similarity.
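The feature combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not call ccg2lambda. The `ProofResult` fields, the word-overlap feature, and the function names are all hypothetical stand-ins, since this page does not list the actual features. The sketch only shows the general shape: one shallow feature is concatenated with features from proof attempts in both entailment directions (A to B and B to A).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProofResult:
    """Hypothetical summary of one proof attempt (premise => hypothesis)."""
    proved: bool         # whether the prover found a proof
    steps: int           # number of inference steps taken
    unproved_goals: int  # sub-goals still open when the attempt stopped


def word_overlap(a: str, b: str) -> float:
    """Shallow feature (assumed for illustration): Jaccard overlap of token sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def proof_features(result: ProofResult) -> List[float]:
    """Turn one proof attempt into numeric features."""
    return [float(result.proved), float(result.steps), float(result.unproved_goals)]


def build_features(a: str, b: str,
                   a_to_b: ProofResult, b_to_a: ProofResult) -> List[float]:
    """Concatenate the shallow feature with proof features from both directions."""
    return [word_overlap(a, b)] + proof_features(a_to_b) + proof_features(b_to_a)


# Made-up proof outcomes for a sentence pair; a real system would obtain
# these from a theorem prover rather than hard-coding them.
features = build_features(
    "A man is playing a guitar",
    "A man is playing an instrument",
    a_to_b=ProofResult(proved=True, steps=5, unproved_goals=0),
    b_to_a=ProofResult(proved=False, steps=9, unproved_goals=1),
)
```

A vector like `features` would then be passed to a regression model trained on human similarity scores. The choice of model is left open here because this page does not specify it.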
Anthology ID:
D17-1071
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
681–691
URL:
https://aclanthology.org/D17-1071
DOI:
10.18653/v1/D17-1071
Cite (ACL):
Hitomi Yanaka, Koji Mineshima, Pascual Martínez-Gómez, and Daisuke Bekki. 2017. Determining Semantic Textual Similarity using Natural Deduction Proofs. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 681–691, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Determining Semantic Textual Similarity using Natural Deduction Proofs (Yanaka et al., EMNLP 2017)
PDF:
https://aclanthology.org/D17-1071.pdf
Video:
https://vimeo.com/238232412
Code:
mynlp/ccg2lambda
Data:
SICK