Lingling Wu
2022
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Tianxiang Sun | Xiangyang Liu | Wei Zhu | Zhichao Geng | Lingling Wu | Yilong He | Yuan Ni | Guotong Xie | Xuanjing Huang | Xipeng Qiu
Findings of the Association for Computational Linguistics: ACL 2022
Early exiting allows instances to exit at different layers according to the estimation of difficulty. Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffers from generalization and threshold-tuning. In contrast, learning to exit, or learning to predict instance difficulty, is a more appealing way. Though some effort has been devoted to employing such “learn-to-exit” modules, it is still unknown whether and how well the instance difficulty can be learned. As a response, we first conduct experiments on the learnability of instance difficulty, which demonstrate that modern neural models perform poorly on predicting instance difficulty. Based on this observation, we propose a simple-yet-effective Hash-based Early Exiting approach (HashEE) that replaces the learn-to-exit modules with hash functions to assign each token to a fixed exiting layer. Different from previous methods, HashEE requires neither internal classifiers nor extra parameters, and therefore is more efficient. HashEE can be used in various tasks (including language understanding and generation) and model architectures such as seq2seq models. Experimental results on classification, regression, and generation tasks demonstrate that HashEE can achieve higher performance with fewer FLOPs and less inference time compared with previous state-of-the-art early exiting methods.
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu | Tianxiang Sun | Junliang He | Jiawen Wu | Lingling Wu | Xinyu Zhang | Hao Jiang | Zhao Cao | Xuanjing Huang | Xipeng Qiu
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Supersized pre-trained language models have pushed the accuracy of various natural language processing (NLP) tasks to a new state-of-the-art (SOTA). Rather than pursuing the reachless SOTA accuracy, more and more researchers have started paying attention to model efficiency and usability. Different from accuracy, the metric for efficiency varies across different studies, making them hard to compare fairly. To that end, this work presents ELUE (Efficient Language Understanding Evaluation), a standard evaluation and a public leaderboard for efficient NLP models. ELUE is dedicated to depicting the Pareto Frontier for various language understanding tasks, such that it can tell whether and how much a method achieves Pareto improvement. Along with the benchmark, we also release a strong baseline, ElasticBERT, which allows BERT to exit at any layer in both static and dynamic ways. We demonstrate that ElasticBERT, despite its simplicity, outperforms or performs on par with SOTA compressed and early exiting models. With ElasticBERT, the proposed ELUE has a strong Pareto Frontier and makes a better evaluation for efficient NLP models.
Co-authors
- Xuan-Jing Huang (黄萱菁) 2
- Xiangyang Liu 2
- Xipeng Qiu (邱锡鹏) 2
- Tianxiang Sun 2
- Zhao Cao 1