Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features

Mengyu Bu, Shuhao Gu, Yang Feng


Abstract
Many-to-many multilingual neural machine translation can be regarded as the process of integrating semantic features from the source sentences with linguistic features from the target sentences. To enhance zero-shot translation, models need to share knowledge across languages, which can be achieved through auxiliary tasks for learning a universal representation or a cross-lingual mapping. To this end, we propose to exploit both semantic and linguistic features across multiple languages to enhance multilingual translation. On the encoder side, we introduce a disentangling learning task that aligns encoder representations by disentangling semantic and linguistic features, thus facilitating knowledge transfer while preserving complete information. On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features that assist target-language generation. Experimental results on multilingual datasets demonstrate significant improvement in zero-shot translation over the baseline system, while maintaining performance in supervised translation. Further analysis validates the effectiveness of our method in leveraging both semantic and linguistic features.
Anthology ID:
2024.findings-acl.620
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10410–10423
URL:
https://aclanthology.org/2024.findings-acl.620
Cite (ACL):
Mengyu Bu, Shuhao Gu, and Yang Feng. 2024. Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features. In Findings of the Association for Computational Linguistics ACL 2024, pages 10410–10423, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features (Bu et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.620.pdf