Li Linlin


2023

Adder Encoder for Pre-trained Language Model
Ding Jianbang | Zhang Suiyun | Li Linlin
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

“BERT, a pre-trained language model entirely based on attention, has proven to be highly performant for many natural language understanding tasks. However, pre-trained language models (PLMs) are often computationally expensive and can hardly be implemented with limited resources. To reduce the energy burden, we introduce adder operations into the Transformer encoder and propose a novel AdderBERT with powerful representation capability. Moreover, we adopt mapping-based distillation to further improve its energy efficiency with an assured performance. Empirical results demonstrate that AdderBERT6 achieves highly competitive performance against that of its teacher BERT-Base on the GLUE benchmark while obtaining a 4.9x reduction in energy consumption.”
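The abstract names adder operations as the multiplication-free building block of the encoder but does not spell out the formulation. Below is a minimal, hedged sketch of one common adder-style similarity (a negative L1 distance between query and key vectors in place of the dot product); the function names and the exact scoring rule are illustrative assumptions, not the paper's confirmed method.

```python
import numpy as np

def adder_scores(Q, K):
    """Attention-style scores as negative L1 distances (additions only).

    Q: (n_q, d) query vectors, K: (n_k, d) key vectors.
    Returns an (n_q, n_k) score matrix; larger means more similar.
    Hypothetical illustration; AdderBERT's exact operation may differ.
    """
    # |q_i - k_j| summed over the feature dimension uses only additions
    diffs = np.abs(Q[:, None, :] - K[None, :, :])  # (n_q, n_k, d)
    return -diffs.sum(axis=-1)

def softmax(x, axis=-1):
    # Numerically stable softmax for turning scores into attention weights
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Tiny usage example with random vectors
rng = np.random.default_rng(0)
Q, K = rng.normal(size=(2, 8)), rng.normal(size=(4, 8))
attn = softmax(adder_scores(Q, K) / np.sqrt(8))
print(attn.shape)  # (2, 4)
```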