MSCFFN: A New FFN with Multi-Space Cross to Accelerate Transformer

Tang Dongge, Qing Yang


Abstract
Transformer models have achieved impressive success in various natural language processing tasks. However, their use in some areas remains limited, and heavy computational complexity is one of the main obstacles. Many model structures have been proposed to reduce this complexity, and some are quite effective. Previous research can be divided into two categories: one uses more efficient training and inference strategies, while the other focuses on replacing the standard self-attention mechanism with linear attention methods. In contrast, we revisit the design of the Transformer and find that the feed-forward network (FFN) is also computationally expensive, especially when the hidden dimension is large. In this paper, we propose a new FFN structure, named MSCFFN, which splits the large matrix space into several smaller spaces to reduce computational complexity, and uses a Multi-Space Cross method to preserve accuracy. To the best of our knowledge, this is the first work to redesign the FFN to accelerate Transformers. We experimentally validate the effectiveness of the proposed method on the Long-Range Arena benchmark, and the results show that MSCFFN achieves faster speed with similar or even better accuracy.
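To make the splitting idea concrete, here is a minimal NumPy sketch contrasting a standard Transformer FFN with a hypothetical multi-space variant. The `msc_ffn` function, the element-wise "cross" between branches, and all dimension choices are illustrative assumptions for exposition only; the actual MSCFFN design is specified in the paper.

```python
import numpy as np

def ffn(x, W1, W2):
    # Standard Transformer FFN: expand to a large hidden dimension h,
    # apply ReLU, then project back to the model dimension d.
    return np.maximum(x @ W1, 0.0) @ W2

def msc_ffn(x, W1s, W2s):
    # Hypothetical multi-space variant (NOT the paper's exact method):
    # k small parallel branches replace one large hidden space; each
    # branch's activation is "crossed" with the mean of all branches
    # (element-wise product) before being projected back and summed.
    hs = [np.maximum(x @ W1, 0.0) for W1 in W1s]  # k small hidden spaces
    mean_h = np.mean(hs, axis=0)                  # shared cross signal
    crossed = [h * mean_h for h in hs]            # toy "cross" step
    return sum(c @ W2 for c, W2 in zip(crossed, W2s))

rng = np.random.default_rng(0)
d, h, k = 6, 32, 4                  # model dim, hidden dim, number of spaces
x = rng.standard_normal((2, d))     # a batch of 2 token vectors

# Standard FFN parameters: one d x h and one h x d projection.
W1 = rng.standard_normal((d, h))
W2 = rng.standard_normal((h, d))

# Split variant: k branches, each with a small hidden width h // k.
W1s = [rng.standard_normal((d, h // k)) for _ in range(k)]
W2s = [rng.standard_normal((h // k, d)) for _ in range(k)]

print(ffn(x, W1, W2).shape)         # (2, 6)
print(msc_ffn(x, W1s, W2s).shape)   # (2, 6)
```

In this toy setup the k branches together have the same total width as the original hidden layer, so it only illustrates the structure of splitting and crossing; the paper's speedup comes from how the smaller spaces and the cross interaction are actually designed.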
Anthology ID:
2023.findings-emnlp.1017
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
15234–15239
URL:
https://aclanthology.org/2023.findings-emnlp.1017
DOI:
10.18653/v1/2023.findings-emnlp.1017
Cite (ACL):
Tang Dongge and Qing Yang. 2023. MSCFFN: A New FFN with Multi-Space Cross to Accelerate Transformer. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 15234–15239, Singapore. Association for Computational Linguistics.
Cite (Informal):
MSCFFN: A New FFN with Multi-Space Cross to Accelerate Transformer (Dongge & Yang, Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.1017.pdf