Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection

Xin Huang, Ashish Khetan, Rene Bidart, Zohar Karnin


Abstract
Transformer-based language models such as BERT (CITATION) have achieved state-of-the-art performance on various NLP tasks, but are computationally prohibitive. A recent line of work uses various heuristics to successively shorten the sequence length as tokens pass through the encoders, for tasks such as classification and ranking that require only a single token embedding for prediction. We present a novel solution to this problem, called Pyramid-BERT, in which we replace the previously used heuristics with a core-set based token selection method justified by theoretical results. The core-set based token selection technique avoids expensive pre-training, enables space-efficient fine-tuning, and thus makes the model suitable for handling longer sequence lengths. We provide extensive experiments establishing the advantages of Pyramid-BERT over several baselines and existing works on the GLUE benchmark and Long Range Arena (CITATION) datasets.
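The paper's exact layer-by-layer selection procedure is given in the PDF; as a rough, hedged illustration of what core-set (farthest-point / k-center) selection over token embeddings can look like, the sketch below keeps the [CLS] token and greedily adds whichever token is farthest from the tokens already selected. The function name greedy_kcenter_coreset, the NumPy setup, and the specific greedy rule are illustrative assumptions, not the authors' implementation.

import numpy as np

def greedy_kcenter_coreset(embeddings, k, keep_first=True):
    """Greedy k-center core-set selection over one layer's token embeddings.

    embeddings: (seq_len, hidden_dim) array of token embeddings.
    k: number of tokens to keep for the next, shorter encoder layer.
    keep_first: always retain the token at position 0 (e.g. [CLS]), since
        downstream classification typically reads its embedding.
    Returns the indices of the selected tokens, in original order.
    """
    n = embeddings.shape[0]
    selected = [0] if keep_first else [int(np.random.randint(n))]
    # Distance from every token to its nearest already-selected token.
    dists = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dists))            # farthest-point (k-center) choice
        selected.append(nxt)
        new_d = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        dists = np.minimum(dists, new_d)       # update nearest-center distances
    return sorted(selected)

# Example: shrink a 128-token sequence to 32 representative tokens,
# then feed the shorter sequence to the next encoder layer.
tokens = np.random.randn(128, 768).astype(np.float32)
kept = greedy_kcenter_coreset(tokens, k=32)
shorter_sequence = tokens[kept]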
Anthology ID:
2022.acl-long.602
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
8798–8817
URL:
https://aclanthology.org/2022.acl-long.602
DOI:
10.18653/v1/2022.acl-long.602
Cite (ACL):
Xin Huang, Ashish Khetan, Rene Bidart, and Zohar Karnin. 2022. Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8798–8817, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection (Huang et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.602.pdf
Software:
2022.acl-long.602.software.zip
Data:
GLUE, LRA, QNLI