Feifei Zhai


2023

Multi-Stage Pre-training Enhanced by ChatGPT for Multi-Scenario Multi-Domain Dialogue Summarization
Weixiao Zhou | Gengyao Li | Xianfu Cheng | Xinnian Liang | Junnan Zhu | Feifei Zhai | Zhoujun Li
Findings of the Association for Computational Linguistics: EMNLP 2023

Dialogue summarization involves a wide range of scenarios and domains. However, existing methods generally only apply to specific scenarios or domains. In this study, we propose a new pre-trained model specifically designed for multi-scenario multi-domain dialogue summarization. It adopts a multi-stage pre-training strategy to reduce the gap between the pre-training objective and fine-tuning objective. Specifically, we first conduct domain-aware pre-training using large-scale multi-scenario multi-domain dialogue data to enhance the adaptability of our pre-trained model. Then, we conduct task-oriented pre-training using large-scale multi-scenario multi-domain “dialogue-summary” parallel data annotated by ChatGPT to enhance the dialogue summarization ability of our pre-trained model. Experimental results on three dialogue summarization datasets from different scenarios and domains indicate that our pre-trained model significantly outperforms previous state-of-the-art models in full fine-tuning, zero-shot, and few-shot settings.

2019

A Compact and Language-Sensitive Multilingual Translation Method
Yining Wang | Long Zhou | Jiajun Zhang | Feifei Zhai | Jingfang Xu | Chengqing Zong
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Multilingual neural machine translation (Multi-NMT) with one encoder-decoder model has made remarkable progress due to its simple deployment. However, this multilingual translation paradigm does not make full use of language commonality and parameter sharing between encoder and decoder. Furthermore, this kind of paradigm cannot outperform the individual models trained on bilingual corpus in most cases. In this paper, we propose a compact and language-sensitive method for multilingual translation. To maximize parameter sharing, we first present a universal representor to replace both encoder and decoder models. To make the representor sensitive for specific languages, we further introduce language-sensitive embedding, attention, and discriminator with the ability to enhance model performance. We verify our methods on various translation scenarios, including one-to-many, many-to-many and zero-shot. Extensive experiments demonstrate that our proposed methods remarkably outperform strong standard multilingual translation systems on WMT and IWSLT datasets. Moreover, we find that our model is especially helpful in low-resource and zero-shot translation scenarios.

2018

Improving the Transformer Translation Model with Document-Level Context
Jiacheng Zhang | Huanbo Luan | Maosong Sun | Feifei Zhai | Jingfang Xu | Min Zhang | Yang Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer still remains a challenge. In this work, we extend the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. As large-scale document-level parallel corpora are usually not available, we introduce a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. Experiments on the NIST Chinese-English datasets and the IWSLT French-English datasets show that our approach improves over Transformer significantly.

Three Strategies to Improve One-to-Many Multilingual Translation
Yining Wang | Jiajun Zhang | Feifei Zhai | Jingfang Xu | Chengqing Zong
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Due to the benefits of model compactness, multilingual translation (including many-to-one, many-to-many and one-to-many) based on a universal encoder-decoder architecture attracts more and more attention. However, previous studies show that one-to-many translation based on this framework cannot perform on par with the individually trained models. In this work, we introduce three strategies to improve one-to-many multilingual translation by balancing the shared and unique features. Within the architecture of one decoder for all target languages, we first exploit the use of unique initial states for different target languages. Then, we employ language-dependent positional embeddings. Finally and especially, we propose to divide the hidden cells of the decoder into shared and language-dependent ones. The extensive experiments demonstrate that our proposed methods can obtain remarkable improvements over the strong baselines. Moreover, our strategies can achieve comparable or even better performance than the individually trained translation models.

2015

A pilot study towards end-to-end MT training
Feifei Zhai | Liang Huang
Proceedings of Machine Translation Summit XV: Papers

Search-Aware Tuning for Hierarchical Phrase-based Decoding
Feifei Zhai | Liang Huang | Kai Zhao
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

RNN-based Derivation Structure Prediction for SMT
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Handling Ambiguities of Bilingual Predicate-Argument Structures for Statistical Machine Translation
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Unsupervised Tree Induction for Tree-based Translation
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Transactions of the Association for Computational Linguistics, Volume 1

In current research, most tree-based translation models are built directly from parse trees. In this study, we go in another direction and build a translation model with an unsupervised tree structure derived from a novel non-parametric Bayesian model. In the model, we utilize synchronous tree substitution grammars (STSG) to capture the bilingual mapping between language pairs. To train the model efficiently, we develop a Gibbs sampler with three novel Gibbs operators. The sampler is capable of exploring the infinite space of tree structures by performing local changes on the tree nodes. Experimental results show that the string-to-tree translation system using our Bayesian tree structures significantly outperforms the strong baseline string-to-tree system using parse trees.

2012

Machine Translation by Modeling Predicate-Argument Structure Transformation
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of COLING 2012

Tree-based Translation without using Parse Trees
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of COLING 2012

2011

Simple but Effective Approaches to Improving Tree-to-tree Model
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of Machine Translation Summit XIII: Papers

Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
Jiajun Zhang | Feifei Zhai | Chengqing Zong
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing