Incremental Transformer: Efficient Encoder for Incremented Text Over MRC and Conversation Tasks

Weisheng Li, Yuechen Wang, Jiaxin Shi, Wengang Zhou, Qi Tian, Houqiang Li


Abstract
Some encoder inputs, such as conversation histories, are frequently extended with short additional inputs like new responses. However, to obtain a real-time encoding of the extended input, existing Transformer-based encoders such as BERT must re-encode the whole extended input without reusing the existing encoding of the original input, which can be prohibitively slow for real-time applications. In this paper, we introduce the Incremental Transformer, an efficient encoder dedicated to faster encoding of incremented input. It takes only the added input as input but, for better performance, attends to cached representations of the original input in its lower layers. By treating questions as additional inputs to a passage, the Incremental Transformer can also be applied to accelerate MRC tasks. Experimental results show a tiny decline in effectiveness but a significant speedup over a traditional full encoder across various MRC and multi-turn conversational question answering tasks. With the help of simple distillation-like auxiliary losses, the Incremental Transformer achieves a 6.2x speedup with a mere 2.2-point accuracy reduction compared to RoBERTa-Large on SQuAD v1.1.
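The mechanism described in the abstract, encoding only the added tokens while letting lower layers attend to cached representations of the original input, can be illustrated with a short PyTorch sketch. This is a minimal illustration under our own assumptions, not the authors' implementation: the class names, the number of cached layers, and the exact attention wiring are hypothetical.

```python
# Minimal sketch of incremental encoding with a per-layer cache (illustrative only).
import torch
import torch.nn as nn


class IncrementalLayer(nn.Module):
    """One encoder layer whose attention keys/values may include cached states."""

    def __init__(self, d_model=768, n_heads=12, d_ff=3072):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, cache=None):
        # Queries come only from the added tokens; keys/values optionally
        # include cached hidden states of the original input.
        kv = x if cache is None else torch.cat([cache, x], dim=1)
        attn_out, _ = self.attn(x, kv, kv)
        x = self.norm1(x + attn_out)
        return self.norm2(x + self.ffn(x))


class IncrementalEncoder(nn.Module):
    def __init__(self, n_layers=12, d_model=768, cached_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [IncrementalLayer(d_model) for _ in range(n_layers)]
        )
        self.cached_layers = cached_layers  # only lower layers consult the cache

    @torch.no_grad()
    def build_cache(self, original_hidden):
        """Encode the original input once and keep each layer's input states."""
        cache, x = [], original_hidden
        for layer in self.layers:
            cache.append(x)
            x = layer(x)
        return cache

    def encode_increment(self, added_hidden, cache):
        """Encode only the added tokens, reusing the cache in lower layers."""
        x = added_hidden
        for i, layer in enumerate(self.layers):
            x = layer(x, cache[i] if i < self.cached_layers else None)
        return x


# Toy usage: cache a 200-token history once, then encode a 20-token addition.
encoder = IncrementalEncoder()
history = torch.randn(1, 200, 768)   # embedded original input (e.g. a passage)
addition = torch.randn(1, 20, 768)   # embedded added input (e.g. a new question)
cache = encoder.build_cache(history)
out = encoder.encode_increment(addition, cache)   # shape: (1, 20, 768)
```

The key point the sketch tries to capture is that the quadratic cost of re-encoding is avoided: only the added tokens form queries, so the per-step cost grows with the length of the addition rather than of the full extended input.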
Anthology ID:
2025.coling-main.590
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
8819–8829
URL:
https://aclanthology.org/2025.coling-main.590/
Cite (ACL):
Weisheng Li, Yuechen Wang, Jiaxin Shi, Wengang Zhou, Qi Tian, and Houqiang Li. 2025. Incremental Transformer: Efficient Encoder for Incremented Text Over MRC and Conversation Tasks. In Proceedings of the 31st International Conference on Computational Linguistics, pages 8819–8829, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Incremental Transformer: Efficient Encoder for Incremented Text Over MRC and Conversation Tasks (Li et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.590.pdf