Title: Easy and Efficient Transformer: Scalable Inference Solution For Large NLP Model
Authors: Gongzheng Li, Yadong Xi, Jingzhen Ding, Duan Wang, Ziyang Luo, Rongsheng Zhang, Bai Liu, Changjie Fan, Xiaoxi Mao, Zeng Zhao
Date: 2022-07
Type: text (conference publication)
Published in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
Editors: Anastassia Loukina, Rashmi Gangadharaiah, Bonan Min
Publisher: Association for Computational Linguistics
Location: Hybrid: Seattle, Washington + Online
Pages: 62-68
Identifier: li-etal-2022-easy
DOI: 10.18653/v1/2022.naacl-industry.8
URL: https://aclanthology.org/2022.naacl-industry.8/