pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference

Mandar Joshi, Eunsol Choi, Omer Levy, Daniel Weld, Luke Zettlemoyer


Abstract
Reasoning about implied relationships (e.g. paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems. This paper proposes new methods for learning and using embeddings of word pairs that implicitly represent background knowledge about such relationships. Our pairwise embeddings are computed as a compositional function of each word’s representation, which is learned by maximizing the pointwise mutual information (PMI) with the contexts in which the two words co-occur. We add these representations to the cross-sentence attention layer of existing inference models (e.g. BiDAF for QA, ESIM for NLI), instead of extending or replacing existing word embeddings. Experiments show a gain of 2.7% on the recently released SQuAD 2.0 and 1.3% on MultiNLI. Our representations also aid in better generalization, with gains of around 6-7% on adversarial SQuAD datasets and 8.8% on the adversarial entailment test set by Glockner et al. (2018).
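The training setup described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' exact architecture: the MLP composition over [x; y; x⊙y], the bag-of-words context encoder, and the negative-sampling loss below are assumptions chosen to show the general idea of scoring a composed pair representation against the contexts in which the two words co-occur (a negative-sampling objective of this kind approximates maximizing PMI between pairs and contexts).

import torch
import torch.nn as nn
import torch.nn.functional as F

class Pair2VecSketch(nn.Module):
    """Minimal sketch of compositional word-pair embeddings (not the paper's exact model).

    A pair representation R(x, y) is an MLP over the two word embeddings and is
    scored against an embedding of the context in which x and y co-occur.
    """

    def __init__(self, vocab_size: int, dim: int = 300):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        # Bag-of-words context encoder (assumption; the paper uses a richer encoder).
        self.context_emb = nn.EmbeddingBag(vocab_size, dim, mode="mean")
        # Compositional function of the two word embeddings.
        self.compose = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def pair_repr(self, x_ids, y_ids):
        x, y = self.word_emb(x_ids), self.word_emb(y_ids)
        return self.compose(torch.cat([x, y, x * y], dim=-1))

    def score(self, x_ids, y_ids, context_ids):
        # Dot product between the pair representation and the context representation.
        return (self.pair_repr(x_ids, y_ids) * self.context_emb(context_ids)).sum(-1)

def negative_sampling_loss(model, x_ids, y_ids, pos_ctx, neg_ctx):
    # Observed contexts should score high; sampled negative contexts should score low.
    pos = F.logsigmoid(model.score(x_ids, y_ids, pos_ctx))
    neg = F.logsigmoid(-model.score(x_ids, y_ids, neg_ctx))
    return -(pos + neg).mean()

# Hypothetical usage: score a batch of 8 word pairs against 5-token contexts.
model = Pair2VecSketch(vocab_size=10000)
x = torch.randint(0, 10000, (8,)); y = torch.randint(0, 10000, (8,))
pos = torch.randint(0, 10000, (8, 5)); neg = torch.randint(0, 10000, (8, 5))
loss = negative_sampling_loss(model, x, y, pos, neg)

At inference time, the learned pair representations would be injected into the cross-sentence attention layer of a downstream model such as BiDAF or ESIM, rather than replacing its word embeddings.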
Anthology ID:
N19-1362
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3597–3608
URL:
https://aclanthology.org/N19-1362
DOI:
10.18653/v1/N19-1362
Cite (ACL):
Mandar Joshi, Eunsol Choi, Omer Levy, Daniel Weld, and Luke Zettlemoyer. 2019. pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3597–3608, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference (Joshi et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-1362.pdf
Code
mandarjoshi90/pair2vec (+ additional community code)
Data
MultiNLI, SNLI, SQuAD