ATFormer: A Learned Performance Model with Transfer Learning Across Devices for Deep Learning Tensor Programs

Yang Bai; Wenqian Zhao; Shuo Yin; Zixiao Wang; Bei Yu

doi:10.18653/v1/2023.emnlp-main.250

Abstract

The training and inference efficiency of ever-larger deep neural networks relies heavily on the performance of tensor operators on specific hardware platforms. Therefore, a compilation-based optimization flow with automatic tensor generation and parameter tuning is necessary for efficient model deployment. While compilation-based methods with performance models can provide dynamic and suitable code optimization, they must explore a large design space with only rough measurement accuracy, and they transfer poorly across hardware platforms. This paper presents ATFormer, a simple yet efficient design with attention-inspired modules that accurately predicts the performance of optimized operators by capturing global and long-range dependencies within a complete scheduling space. Compared with state-of-the-art methods, ATFormer can predict the optimal implementation of tensor operators to reduce inference time with minimal effort on modern DNN benchmarks. Furthermore, ATFormer with pre-trained parameters can quickly adapt to different workloads and hardware via transfer learning.

Anthology ID: 2023.emnlp-main.250
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 4102–4116
URL: https://aclanthology.org/2023.emnlp-main.250/
DOI: 10.18653/v1/2023.emnlp-main.250
Cite (ACL): Yang Bai, Wenqian Zhao, Shuo Yin, Zixiao Wang, and Bei Yu. 2023. ATFormer: A Learned Performance Model with Transfer Learning Across Devices for Deep Learning Tensor Programs. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 4102–4116, Singapore. Association for Computational Linguistics.
Cite (Informal): ATFormer: A Learned Performance Model with Transfer Learning Across Devices for Deep Learning Tensor Programs (Bai et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.250.pdf
Video: https://aclanthology.org/2023.emnlp-main.250.mp4
