Evaluating Composition Models for Verb Phrase Elliptical Sentence Embeddings

Gijs Wijnholds, Mehrnoosh Sadrzadeh


Abstract
Ellipsis is a natural language phenomenon where part of a sentence is missing and its information must be recovered from its surrounding context, as in “Cats chase dogs and so do foxes.”. Formal semantics has different methods for resolving ellipsis and recovering the missing information, but the problem has not been considered for distributional semantics, where words have vector embeddings and combinations thereof provide embeddings for sentences. In elliptical sentences these combinations go beyond linear as copying of elided information is necessary. In this paper, we develop different models for embedding VP-elliptical sentences. We extend existing verb disambiguation and sentence similarity datasets to ones containing elliptical phrases and evaluate our models on these datasets for a variety of non-linear combinations and their linear counterparts. We compare results of these compositional models to state of the art holistic sentence encoders. Our results show that non-linear addition and a non-linear tensor-based composition outperform the naive non-compositional baselines and the linear models, and that sentence encoders perform well on sentence similarity, but not on verb disambiguation.
Anthology ID:
N19-1023
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
261–271
Language:
URL:
https://aclanthology.org/N19-1023
DOI:
10.18653/v1/N19-1023
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/N19-1023.pdf