Bokyung Son
2020
Sparse and Decorrelated Representations for Stable Zero-shot NMT
Bokyung Son
|
Sungwon Lyu
Findings of the Association for Computational Linguistics: EMNLP 2020
Using a single encoder and decoder for all directions and training with English-centric data is a popular scheme for multilingual NMT. However, zero-shot translation under this scheme is vulnerable to changes in training conditions, as the model degenerates by decoding non-English texts into English regardless of the target specifier token. We present that enforcing both sparsity and decorrelation on encoder intermediate representations with the SLNI regularizer (Aljundi et al., 2019) efficiently mitigates this problem, without performance loss in supervised directions. Notably, effects of SLNI turns out to be irrelevant to promoting language-invariance in encoder representations.
AttnIO: Knowledge Graph Exploration with In-and-Out Attention Flow for Knowledge-Grounded Dialogue
Jaehun Jung
|
Bokyung Son
|
Sungwon Lyu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Retrieving the proper knowledge relevant to conversational context is an important challenge in dialogue systems, to engage users with more informative response. Several recent works propose to formulate this knowledge selection problem as a path traversal over an external knowledge graph (KG), but show only a limited utilization of KG structure, leaving rooms of improvement in performance. To this effect, we present AttnIO, a new dialog-conditioned path traversal model that makes a full use of rich structural information in KG based on two directions of attention flows. Through the attention flows, AttnIO is not only capable of exploring a broad range of multi-hop knowledge paths, but also learns to flexibly adjust the varying range of plausible nodes and edges to attend depending on the dialog context. Empirical evaluations present a marked performance improvement of AttnIO compared to all baselines in OpenDialKG dataset. Also, we find that our model can be trained to generate an adequate knowledge path even when the paths are not available and only the destination nodes are given as label, making it more applicable to real-world dialogue systems.
Revisiting Modularized Multilingual NMT to Meet Industrial Demands
Sungwon Lyu
|
Bokyung Son
|
Kichang Yang
|
Jaekyoung Bae
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
The complete sharing of parameters for multilingual translation (1-1) has been the mainstream approach in current research. However, degraded performance due to the capacity bottleneck and low maintainability hinders its extensive adoption in industries. In this study, we revisit the multilingual neural machine translation model that only share modules among the same languages (M2) as a practical alternative to 1-1 to satisfy industrial requirements. Through comprehensive experiments, we identify the benefits of multi-way training and demonstrate that the M2 can enjoy these benefits without suffering from the capacity bottleneck. Furthermore, the interlingual space of the M2 allows convenient modification of the model. By leveraging trained modules, we find that incrementally added modules exhibit better performance than singly trained models. The zero-shot performance of the added modules is even comparable to supervised models. Our findings suggest that the M2 can be a competent candidate for multilingual translation in industries.
Search