A Cross-Sentence Latent Variable Model for Semi-Supervised Text Sequence Matching

Jihun Choi, Taeuk Kim, Sang-goo Lee


Abstract
We present a latent variable model for predicting the relationship between a pair of text sequences. Unlike previous auto-encoding–based approaches that consider each sequence separately, our proposed framework utilizes both sequences within a single model by generating a sequence that has a given relationship with a source sequence. We further extend the cross-sentence generating framework to facilitate semi-supervised training. We also define novel semantic constraints that lead the decoder network to generate semantically plausible and diverse sequences. We demonstrate the effectiveness of the proposed model through quantitative and qualitative experiments and achieve state-of-the-art results on semi-supervised natural language inference and paraphrase identification.
Anthology ID:
P19-1469
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4747–4761
URL:
https://aclanthology.org/P19-1469
DOI:
10.18653/v1/P19-1469
Cite (ACL):
Jihun Choi, Taeuk Kim, and Sang-goo Lee. 2019. A Cross-Sentence Latent Variable Model for Semi-Supervised Text Sequence Matching. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4747–4761, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Cross-Sentence Latent Variable Model for Semi-Supervised Text Sequence Matching (Choi et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1469.pdf
Data
SNLI