Sentence Similarity Learning by Lexical Decomposition and Composition

Zhiguo Wang, Haitao Mi, Abraham Ittycheriah


Abstract
Most conventional sentence similarity methods focus only on the similar parts of two input sentences and simply ignore the dissimilar parts, which often carry useful cues about the sentences' meanings. In this work, we propose a model that takes into account both similarities and dissimilarities by decomposing and composing lexical semantics over sentences. The model represents each word as a vector and calculates a semantic matching vector for each word based on all words in the other sentence. Each word vector is then decomposed into a similar component and a dissimilar component based on its semantic matching vector. After this, a two-channel CNN model is employed to capture features by composing the similar and dissimilar components. Finally, a similarity score is estimated over the composed feature vectors. Experimental results show that our model achieves state-of-the-art performance on the answer sentence selection task and a comparable result on the paraphrase identification task.
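The abstract outlines a three-stage pipeline: semantic matching, lexical decomposition, and CNN-based composition. Below is a minimal sketch of the first two stages, assuming a "max" matching function (each word is matched to the most cosine-similar word vector in the other sentence) and an orthogonal decomposition (the projection of a word vector onto its matching vector is taken as the similar component, the residual as the dissimilar component). These specific choices, and all function names, are illustrative assumptions, not necessarily the paper's exact formulation.

import numpy as np

# Illustrative sketch; the matching and decomposition functions here are
# assumptions, not necessarily the exact functions used in the paper.

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def match_vectors(S, T):
    # Semantic matching: for each row s_i of S, return the row of T most
    # similar to it ("max" matching; weighted variants are also possible).
    return np.stack([T[int(np.argmax([cosine(s, t) for t in T]))] for s in S])

def decompose(S, S_hat):
    # Orthogonal decomposition: split each word vector into a similar
    # component (projection onto its matching vector) and a dissimilar
    # component (the residual).
    plus, minus = [], []
    for s, m in zip(S, S_hat):
        parallel = (float(s @ m) / (float(m @ m) + 1e-8)) * m
        plus.append(parallel)        # similar component
        minus.append(s - parallel)   # dissimilar component
    return np.stack(plus), np.stack(minus)

# Toy usage: two "sentences" of random 5-dimensional word embeddings.
rng = np.random.default_rng(0)
S = rng.normal(size=(4, 5))   # sentence 1: 4 words
T = rng.normal(size=(6, 5))   # sentence 2: 6 words
S_plus, S_minus = decompose(S, match_vectors(S, T))

In a full model, S_plus and S_minus (and the corresponding components for the second sentence) would feed the two channels of the CNN, whose pooled features are composed into a single similarity score.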
Anthology ID:
C16-1127
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
1340–1349
URL:
https://aclanthology.org/C16-1127
Cite (ACL):
Zhiguo Wang, Haitao Mi, and Abraham Ittycheriah. 2016. Sentence Similarity Learning by Lexical Decomposition and Composition. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1340–1349, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Sentence Similarity Learning by Lexical Decomposition and Composition (Wang et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1127.pdf
Data:
TrecQA, WikiQA