INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion

Hengchao Shang, Zongyao Li, Daimeng Wei, Jiaxin Guo, Minghan Wang, Xiaoyu Chen, Lizhi Lei, Hao Yang

Abstract
Computer-aided translation (CAT) aims to enhance human translation efficiency and remains important in scenarios where machine translation cannot meet quality requirements. One fundamental task within this field is Word-Level Auto Completion (WLAC), which predicts a target word given a source sentence, a translation context, and a human-typed character sequence. Previous works either employed word classification models to exploit contextual information from both sides of the target word or directly disregarded dependencies on the right-side context. Furthermore, the key information, i.e., the human-typed sequence, was used only as a prefix constraint in the decoding module. In this paper, we propose the INarIG (Iterative Non-autoregressive Instruct Generation) model, which constructs the human-typed sequence into an Instruction Unit and employs iterative decoding with subwords to fully utilize the input information given in the task. Our model is more competent in dealing with low-frequency words (the core scenario of this task), and achieves state-of-the-art results on the WMT22 and benchmark datasets, with a maximum increase of over 10% in prediction accuracy.
Anthology ID:
2023.findings-emnlp.948
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14217–14228
URL:
https://aclanthology.org/2023.findings-emnlp.948
DOI:
10.18653/v1/2023.findings-emnlp.948
Cite (ACL):
Hengchao Shang, Zongyao Li, Daimeng Wei, Jiaxin Guo, Minghan Wang, Xiaoyu Chen, Lizhi Lei, and Hao Yang. 2023. INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 14217–14228, Singapore. Association for Computational Linguistics.
Cite (Informal):
INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion (Shang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.948.pdf