Multi-way Variational NMT for UGC: Improving Robustness in Zero-shot Scenarios via Mixture Density Networks

José Rosales Núñez, Djamé Seddah, Guillaume Wisniewski


Abstract
This work presents a novel Variational Neural Machine Translation (VNMT) architecture with enhanced robustness properties, which we investigate through a detailed case-study addressing noisy French user-generated content (UGC) translation to English. We show that the proposed model, with results comparable or superior to state-of-the-art VNMT, improves performance over UGC translation in a zero-shot evaluation scenario while keeping optimal translation scores on in-domain test sets. We elaborate on such results by visualizing and explaining how neural learning representations behave when processing UGC noise. In addition, we show that VNMT enforces robustness to the learned embeddings, which can be later used for robust transfer learning approaches.
Anthology ID:
2023.nodalida-1.45
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
SIG:
Publisher:
University of Tartu Library
Note:
Pages:
447–459
Language:
URL:
https://aclanthology.org/2023.nodalida-1.45
DOI:
Bibkey:
Cite (ACL):
José Rosales Núñez, Djamé Seddah, and Guillaume Wisniewski. 2023. Multi-way Variational NMT for UGC: Improving Robustness in Zero-shot Scenarios via Mixture Density Networks. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 447–459, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Multi-way Variational NMT for UGC: Improving Robustness in Zero-shot Scenarios via Mixture Density Networks (Rosales Núñez et al., NoDaLiDa 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.nodalida-1.45.pdf