Contextualized and Generalized Sentence Representations by Contrastive Self-Supervised Learning: A Case Study on Discourse Relation Analysis

Hirokazu Kiyomaru, Sadao Kurohashi


Abstract
We propose a method to learn contextualized and generalized sentence representations using contrastive self-supervised learning. In the proposed method, a model is given a text consisting of multiple sentences. One sentence is randomly selected as a target sentence. The model is trained to maximize the similarity between the representation of the target sentence with its context and that of the masked target sentence with the same context. Simultaneously, the model minimizes the similarity between the latter representation and the representation of a random sentence with the same context. We apply our method to discourse relation analysis in English and Japanese and show that it outperforms strong baseline methods based on BERT, XLNet, and RoBERTa.
Anthology ID:
2021.naacl-main.442
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
5578–5584
Language:
URL:
https://aclanthology.org/2021.naacl-main.442
DOI:
10.18653/v1/2021.naacl-main.442
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2021.naacl-main.442.pdf
Data
BookCorpus