Shuffled-token Detection for Refining Pre-trained RoBERTa

Subhadarshi Panda, Anjali Agrawal, Jeewon Ha, Benjamin Bloch


Abstract
State-of-the-art transformer models have achieved robust performance on a variety of NLP tasks. Many of these approaches employ domain-agnostic pre-training tasks to train models that yield highly generalized sentence representations, which can then be fine-tuned for specific downstream tasks. We propose refining a pre-trained NLP model using the objective of detecting shuffled tokens. We take a sequential approach, starting from the pre-trained RoBERTa model and further training it with our objective. Applying a random shuffling strategy at the word level, we found that our approach enables the RoBERTa model to achieve better performance on 4 out of 7 GLUE tasks. Our results indicate that learning to detect shuffled tokens is a promising approach for learning more coherent sentence representations.
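
The following is a minimal sketch of the shuffled-token detection objective described in the abstract, assuming the Hugging Face transformers and PyTorch libraries; the shuffling helper, the shuffling ratio, and the single training step are illustrative placeholders rather than the authors' released implementation.

```python
# Sketch of shuffled-token detection on top of pre-trained RoBERTa.
# Assumes Hugging Face `transformers` and `torch`; hyperparameters are illustrative.
import random
import torch
from transformers import RobertaTokenizerFast, RobertaForTokenClassification

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
# Binary token-classification head: 0 = token in its original position, 1 = shuffled.
model = RobertaForTokenClassification.from_pretrained("roberta-base", num_labels=2)

def make_example(sentence, shuffle_ratio=0.15):
    """Move a fraction of words to new positions; label each word 1 if it moved."""
    words = sentence.split()
    n = max(2, int(len(words) * shuffle_ratio))
    idx = sorted(random.sample(range(len(words)), n))
    rotated = idx[1:] + idx[:1]  # rotate the selected positions so each chosen word moves
    shuffled = words[:]
    for src, dst in zip(idx, rotated):
        shuffled[dst] = words[src]
    labels = [int(shuffled[i] != words[i]) for i in range(len(words))]
    return shuffled, labels

words, word_labels = make_example("the quick brown fox jumps over the lazy dog")
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
# Propagate each word-level label to its subword tokens; ignore special tokens.
token_labels = [-100 if wid is None else word_labels[wid] for wid in enc.word_ids()]
labels = torch.tensor([token_labels])

outputs = model(**enc, labels=labels)
outputs.loss.backward()  # a full refinement loop would step an optimizer here
```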
Anthology ID:
2021.naacl-srw.12
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
June
Year:
2021
Address:
Online
Editors:
Esin Durmus, Vivek Gupta, Nelson Liu, Nanyun Peng, Yu Su
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
88–93
URL:
https://aclanthology.org/2021.naacl-srw.12
DOI:
10.18653/v1/2021.naacl-srw.12
Cite (ACL):
Subhadarshi Panda, Anjali Agrawal, Jeewon Ha, and Benjamin Bloch. 2021. Shuffled-token Detection for Refining Pre-trained RoBERTa. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 88–93, Online. Association for Computational Linguistics.
Cite (Informal):
Shuffled-token Detection for Refining Pre-trained RoBERTa (Panda et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-srw.12.pdf
Video:
https://aclanthology.org/2021.naacl-srw.12.mp4