On the utility of enhancing BERT syntactic bias with Token Reordering Pretraining

Yassir El Mesbahi, Atif Mahmud, Abbas Ghaddar, Mehdi Rezagholizadeh, Phillippe Langlais, Prasanna Parthasarathi


Abstract
Self-supervised language modelling (LM) objectives, such as BERT's masked LM (MLM), have become the default choice for pretraining language models. TOken Reordering (TOR) pretraining objectives, which go beyond token prediction, have not yet been extensively studied. In this work, we explore the challenges that underlie the development and usefulness of such objectives on downstream language tasks. In particular, we design a novel TOR pretraining objective which predicts whether two tokens are adjacent given a partial bag-of-tokens input. In addition, we investigate the usefulness of a Graph Isomorphism Network (GIN), placed on top of the BERT encoder, to enhance the model's ability to leverage topological signal from the encoded representations. We compare the language understanding abilities of TOR to those of MLM on word-order-sensitive (e.g., dependency parsing) and word-order-insensitive (e.g., text classification) tasks in both full-training and few-shot settings. Our results indicate that TOR is competitive with MLM on the GLUE language understanding benchmark, and slightly superior on syntax-dependent datasets, especially in the few-shot setting.
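To make the adjacency-prediction objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes PyTorch and the HuggingFace transformers library; the TokenAdjacencyHead class, the example sentence, and the pair/label tensors are hypothetical illustrations. The GIN layer mentioned in the abstract is omitted for brevity.

```python
# Sketch of a token-adjacency prediction head on top of a BERT encoder.
# Given pairs of token positions from a (partially shuffled) input, the head
# scores whether the two tokens are adjacent in the original sentence.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class TokenAdjacencyHead(nn.Module):
    """Binary scorer over pairs of token representations (hypothetical name)."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(2 * hidden_size, hidden_size),
            nn.GELU(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, hidden_states, pair_indices):
        # hidden_states: (batch, seq_len, hidden); pair_indices: (batch, num_pairs, 2)
        hidden = hidden_states.size(-1)
        left = torch.gather(
            hidden_states, 1, pair_indices[..., 0:1].expand(-1, -1, hidden))
        right = torch.gather(
            hidden_states, 1, pair_indices[..., 1:2].expand(-1, -1, hidden))
        # Concatenate the two token vectors and score the pair: (batch, num_pairs)
        return self.scorer(torch.cat([left, right], dim=-1)).squeeze(-1)

encoder = BertModel.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
head = TokenAdjacencyHead(encoder.config.hidden_size)

batch = tokenizer(["the cat sat on the mat"], return_tensors="pt")
hidden_states = encoder(**batch).last_hidden_state

# Illustrative pairs: tokens (1, 2) are adjacent, tokens (1, 4) are not.
pairs = torch.tensor([[[1, 2], [1, 4]]])
labels = torch.tensor([[1.0, 0.0]])
loss = nn.BCEWithLogitsLoss()(head(hidden_states, pairs), labels)
loss.backward()
```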
Anthology ID:
2023.conll-1.12
Volume:
Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)
Month:
December
Year:
2023
Address:
Singapore
Editors:
Jing Jiang, David Reitter, Shumin Deng
Venue:
CoNLL
Publisher:
Association for Computational Linguistics
Pages:
165–182
URL:
https://aclanthology.org/2023.conll-1.12
DOI:
10.18653/v1/2023.conll-1.12
Cite (ACL):
Yassir El Mesbahi, Atif Mahmud, Abbas Ghaddar, Mehdi Rezagholizadeh, Phillippe Langlais, and Prasanna Parthasarathi. 2023. On the utility of enhancing BERT syntactic bias with Token Reordering Pretraining. In Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL), pages 165–182, Singapore. Association for Computational Linguistics.
Cite (Informal):
On the utility of enhancing BERT syntactic bias with Token Reordering Pretraining (El Mesbahi et al., CoNLL 2023)
PDF:
https://aclanthology.org/2023.conll-1.12.pdf