Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination

Jianxin Liang, Chang Liu, Chongyang Tao, Jiazhan Feng, Dongyan Zhao


Abstract
Although the incorporation of pre-trained language models (PLMs) significantly pushes the research frontier of multi-turn response selection, it brings a new issue of heavy computation costs. To alleviate this problem and make the PLM-based response selection model both effective and efficient, we propose an inference framework together with a post-training strategy that builds upon any pre-trained transformer-based response selection model to accelerate inference by progressively selecting and eliminating unimportant content under the guidance of context-response dual-attention. Specifically, at each transformer layer, we first identify the importance of each word based on context-to-response and response-to-context attention, then select a number of unimportant words to be eliminated following a retention configuration derived from evolutionary search while passing the rest of the representations into deeper layers. To mitigate the training-inference gap posed by content elimination, we introduce a post-training strategy where we use knowledge distillation to force the model with progressively eliminated content to mimic the predictions of the original model with no content elimination. Experiments on three benchmarks indicate that our method can effectively speed up SOTA models without much performance degradation and shows a better trade-off between speed and performance than previous methods.
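The per-layer elimination step and the distillation objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the head-averaged attention matrix, the mean-received-attention scoring rule, and the KL-divergence loss are all assumptions chosen for illustration, and the retention count `keep` is simply passed in rather than found by evolutionary search.

```python
import math


def dual_attention_scores(attn, n_ctx):
    """Score each token by the cross-segment attention it receives.

    attn  : square matrix (list of lists); attn[i][j] is the attention
            that token i pays to token j (assumed averaged over heads).
    n_ctx : number of context tokens; the remaining tokens are the response.

    Assumption: a context token is scored by the mean attention it receives
    from response tokens, and a response token by the mean attention it
    receives from context tokens. Both segments must be non-empty.
    """
    n = len(attn)
    scores = []
    for j in range(n):
        senders = range(n_ctx, n) if j < n_ctx else range(0, n_ctx)
        scores.append(sum(attn[i][j] for i in senders) / len(senders))
    return scores


def eliminate(hidden, attn, n_ctx, keep):
    """Keep the `keep` highest-scoring tokens and drop the rest.

    Returns the surviving representations in their original order, the
    surviving indices, and the new number of context tokens.
    """
    scores = dual_attention_scores(attn, n_ctx)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])  # restore original token order
    new_hidden = [hidden[j] for j in kept]
    new_n_ctx = sum(1 for j in kept if j < n_ctx)
    return new_hidden, kept, new_n_ctx


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student): an assumed form of the distillation loss.

    The teacher is the original model without elimination; the student is
    the same model run with content elimination.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy example: two context tokens followed by two response tokens.
toy_attn = [
    [0.1, 0.1, 0.7, 0.1],
    [0.2, 0.2, 0.5, 0.1],
    [0.6, 0.1, 0.2, 0.1],
    [0.5, 0.2, 0.2, 0.1],
]
toy_hidden = ["c0", "c1", "r0", "r1"]
toy_kept_hidden, toy_kept, toy_n_ctx = eliminate(toy_hidden, toy_attn, n_ctx=2, keep=2)
```

In the toy example, the first context token and the first response token receive the most cross-segment attention, so they are the two that survive. In a full model the surviving representations would be passed to the next transformer layer, with a separate retention count per layer.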
Anthology ID:
2023.findings-acl.422
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6758–6770
URL:
https://aclanthology.org/2023.findings-acl.422
DOI:
10.18653/v1/2023.findings-acl.422
Cite (ACL):
Jianxin Liang, Chang Liu, Chongyang Tao, Jiazhan Feng, and Dongyan Zhao. 2023. Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6758–6770, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination (Liang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.422.pdf