Hongfei Yu


pdf bib
Factorized Transformer for Multi-Domain Neural Machine Translation
Yongchao Deng | Hongfei Yu | Heng Yu | Xiangyu Duan | Weihua Luo
Findings of the Association for Computational Linguistics: EMNLP 2020

Multi-Domain Neural Machine Translation (NMT) aims at building a single system that performs well on a range of target domains. However, along with the extreme diversity of cross-domain wording and phrasing style, the imperfections of training data distribution and the inherent defects of the current sequential learning process all contribute to making the task of multi-domain NMT very challenging. To mitigate these problems, we propose the Factorized Transformer, which consists of an in-depth factorization of the parameters of an NMT model, namely Transformer in this paper, into two categories: domain-shared ones that encode common cross-domain knowledge and domain-specific ones that are private for each constituent domain. We experiment with various designs of our model and conduct extensive validations on English to French open multi-domain dataset. Our approach achieves state-of-the-art performance and opens up new perspectives for multi-domain and open-domain applications.


pdf bib
Contrastive Attention Mechanism for Abstractive Sentence Summarization
Xiangyu Duan | Hongfei Yu | Mingming Yin | Min Zhang | Weihua Luo | Yue Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose a contrastive attention mechanism to extend the sequence-to-sequence framework for abstractive sentence summarization task, which aims to generate a brief summary of a given source sentence. The proposed contrastive attention mechanism accommodates two categories of attention: one is the conventional attention that attends to relevant parts of the source sentence, the other is the opponent attention that attends to irrelevant or less relevant parts of the source sentence. Both attentions are trained in an opposite way so that the contribution from the conventional attention is encouraged and the contribution from the opponent attention is discouraged through a novel softmax and softmin functionality. Experiments on benchmark datasets show that, the proposed contrastive attention mechanism is more focused on the relevant parts for the summary than the conventional attention mechanism, and greatly advances the state-of-the-art performance on the abstractive sentence summarization task. We release the code at https://github.com/travel-go/ Abstractive-Text-Summarization.