Qun Liu


2021

pdf bib
Multilingual Speech Translation with Unified Transformer: Huawei Noah’s Ark Lab at IWSLT 2021
Xingshan Zeng | Liangyou Li | Qun Liu
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)

This paper describes the system submitted to the IWSLT 2021 Multilingual Speech Translation (MultiST) task from Huawei Noah’s Ark Lab. We use a unified transformer architecture for our MultiST model, so that the data from different modalities (i.e., speech and text) and different tasks (i.e., Speech Recognition, Machine Translation, and Speech Translation) can be exploited to enhance the model’s ability. Specifically, speech and text inputs are firstly fed to different feature extractors to extract acoustic and textual features, respectively. Then, these features are processed by a shared encoder–decoder architecture. We apply several training techniques to improve the performance, including multi-task learning, task-level curriculum learning, data augmentation, etc. Our final system achieves significantly better results than bilingual baselines on supervised language pairs and yields reasonable results on zero-shot language pairs.

pdf bib
Proceedings of the Second Workshop on Automatic Simultaneous Translation
Hua Wu | Colin Cherry | Liang Huang | Zhongjun He | Qun Liu | Maha Elbayad | Mark Liberman | Haifeng Wang | Mingbo Ma | Ruiqing Zhang
Proceedings of the Second Workshop on Automatic Simultaneous Translation

pdf bib
A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering
Zhihong Shao | Lifeng Shang | Qun Liu | Minlie Huang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Weakly supervised question answering usually has only the final answers as supervision signals while the correct solutions to derive the answers are not provided. This setting gives rise to the spurious solution problem: there may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance (e.g., producing wrong solutions or answers). For example, for discrete reasoning tasks as on DROP, there may exist many equations to derive a numeric answer, and typically only one of them is correct. Previous learning methods mostly filter out spurious solutions with heuristics or using model confidence, but do not explicitly exploit the semantic correlations between a question and its solution. In this paper, to alleviate the spurious solution problem, we propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions. Extensive experiments on four question answering datasets show that our method significantly outperforms previous learning methods in terms of task performance and is more effective in training models to produce correct solutions.

pdf bib
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai | Wei Zhang | Lu Hou | Lifeng Shang | Jin Jin | Xin Jiang | Qun Liu | Michael Lyu | Irwin King
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The rapid development of large pre-trained language models has greatly increased the demand for model compression techniques, among which quantization is a popular solution. In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight binarization. We find that a binary BERT is hard to be trained directly than a ternary counterpart due to its complex and irregular loss landscape. Therefore, we propose ternary weight splitting, which initializes BinaryBERT by equivalently splitting from a half-sized ternary network. The binary model thus inherits the good performance of the ternary one, and can be further enhanced by fine-tuning the new architecture after splitting. Empirical results show that our BinaryBERT has only a slight performance drop compared with the full-precision model while being 24x smaller, achieving the state-of-the-art compression results on the GLUE and SQuAD benchmarks. Code will be released.

pdf bib
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
Yichun Yin | Cheng Chen | Lifeng Shang | Xin Jiang | Xiao Chen | Qun Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Pre-trained language models (PLMs) have achieved great success in natural language processing. Most of PLMs follow the default setting of architecture hyper-parameters (e.g., the hidden dimension is a quarter of the intermediate dimension in feed-forward sub-networks) in BERT. Few studies have been conducted to explore the design of architecture hyper-parameters in BERT, especially for the more efficient PLMs with tiny sizes, which are essential for practical deployment on resource-constrained devices. In this paper, we adopt the one-shot Neural Architecture Search (NAS) to automatically search architecture hyper-parameters. Specifically, we carefully design the techniques of one-shot learning and the search space to provide an adaptive and efficient development way of tiny PLMs for various latency constraints. We name our method AutoTinyBERT and evaluate its effectiveness on the GLUE and SQuAD benchmarks. The extensive experiments show that our method outperforms both the SOTA search-based baseline (NAS-BERT) and the SOTA distillation-based methods (such as DistilBERT, TinyBERT, MiniLM, and MobileBERT). In addition, based on the obtained architectures, we propose a more efficient development method that is even faster than the development of a single PLM. The source code and models will be publicly available upon publication.

pdf bib
TGEA: An Error-Annotated Dataset and Benchmark Tasks for TextGeneration from Pretrained Language Models
Jie He | Bo Peng | Yi Liao | Qun Liu | Deyi Xiong
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In order to deeply understand the capability of pretrained language models in text generation and conduct a diagnostic evaluation, we propose TGEA, an error-annotated dataset with multiple benchmark tasks for text generation from pretrained language models (PLMs). We use carefully selected prompt words to guide GPT-2 to generate candidate sentences, from which we select 47K for error annotation. Crowdsourced workers manually check each of these sentences and detect 12k erroneous sentences. We create an error taxonomy to cover 24 types of errors occurring in these erroneous sentences according to the nature of errors with respect to linguistics and knowledge (e.g., common sense). For each erroneous span in PLM-generated sentences, we also detect another span that is closely associated with it. Each error is hence manually labeled with comprehensive annotations, including the span of the error, the associated span, minimal correction to the error, the type of the error, and rationale behind the error. Apart from the fully annotated dataset, we also present a detailed description of the data collection procedure, statistics and analysis of the dataset. This is the first dataset with comprehensive annotations for PLM-generated texts, which facilitates the diagnostic evaluation of PLM-based text generation. Furthermore, we use TGEA as a benchmark dataset and propose a series of automatic diagnosis tasks, including error detection, error type classification, associated span detection, error rationale generation, to further promote future study on the automatic error detection and correction on texts generated by pretrained language models.

pdf bib
GhostBERT: Generate More Features with Cheap Operations for BERT
Zhiqi Huang | Lu Hou | Lifeng Shang | Xin Jiang | Xiao Chen | Qun Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Transformer-based pre-trained language models like BERT, though powerful in many tasks, are expensive in both memory and computation, due to their large number of parameters. Previous works show that some parameters in these models can be pruned away without severe accuracy drop. However, these redundant features contribute to a comprehensive understanding of the training data and removing them weakens the model’s representation ability. In this paper, we propose GhostBERT, which generates more features with very cheap operations from the remaining features. In this way, GhostBERT has similar memory and computational cost as the pruned model, but enjoys much larger representation power. The proposed ghost module can also be applied to unpruned BERT models to enhance their performance with negligible additional parameters and computation. Empirical results on the GLUE benchmark on three backbone models (i.e., BERT, RoBERTa and ELECTRA) verify the efficacy of our proposed method.

pdf bib
Better Robustness by More Coverage: Adversarial and Mixup Data Augmentation for Robust Finetuning
Chenglei Si | Zhengyan Zhang | Fanchao Qi | Zhiyuan Liu | Yasheng Wang | Qun Liu | Maosong Sun
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
HyKnow: End-to-End Task-Oriented Dialog Modeling with Hybrid Knowledge Management
Silin Gao | Ryuichi Takanobu | Wei Peng | Qun Liu | Minlie Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
RealTranS: End-to-End Simultaneous Speech Translation with Convolutional Weighted-Shrinking Transformer
Xingshan Zeng | Liangyou Li | Qun Liu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Two Parents, One Child: Dual Transfer for Low-Resource Neural Machine Translation
Meng Zhang | Liangyou Li | Qun Liu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
HyperText: Endowing FastText with Hyperbolic Geometry
Yudong Zhu | Di Zhou | Jinghui Xiao | Xin Jiang | Xiao Chen | Qun Liu
Findings of the Association for Computational Linguistics: EMNLP 2020

Natural language data exhibit tree-like hierarchical structures such as the hypernym-hyponym hierarchy in WordNet. FastText, as the state-of-the-art text classifier based on shallow neural network in Euclidean space, may not represent such hierarchies precisely with limited representation capacity. Considering that hyperbolic space is naturally suitable for modelling tree-like hierarchical data, we propose a new model named HyperText for efficient text classification by endowing FastText with hyperbolic geometry. Empirically, we show that HyperText outperforms FastText on a range of text classification tasks with much reduced parameters.

pdf bib
BERT-MK: Integrating Graph Contextualized Knowledge into Pre-trained Language Models
Bin He | Di Zhou | Jinghui Xiao | Xin Jiang | Qun Liu | Nicholas Jing Yuan | Tong Xu
Findings of the Association for Computational Linguistics: EMNLP 2020

Complex node interactions are common in knowledge graphs (KGs), and these interactions can be considered as contextualized knowledge exists in the topological structure of KGs. Traditional knowledge representation learning (KRL) methods usually treat a single triple as a training unit, neglecting the usage of graph contextualized knowledge. To utilize these unexploited graph-level knowledge, we propose an approach to model subgraphs in a medical KG. Then, the learned knowledge is integrated with a pre-trained language model to do the knowledge generalization. Experimental results demonstrate that our model achieves the state-of-the-art performance on several medical NLP tasks, and the improvement above MedERNIE indicates that graph contextualized knowledge is beneficial.

pdf bib
The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation
Jie He | Tao Wang | Deyi Xiong | Qun Liu
Findings of the Association for Computational Linguistics: EMNLP 2020

Does neural machine translation yield translations that are congenial with common sense? In this paper, we present a test suite to evaluate the commonsense reasoning capability of neural machine translation. The test suite consists of three test sets, covering lexical and contextless/contextual syntactic ambiguity that requires commonsense knowledge to resolve. We manually create 1,200 triples, each of which contain a source sentence and two contrastive translations, involving 7 different common sense types. Language models pretrained on large-scale corpora, such as BERT, GPT-2, achieve a commonsense reasoning accuracy of lower than 72% on target translations of this test suite. We conduct extensive experiments on the test suite to evaluate commonsense reasoning in neural machine translation and investigate factors that have impact on this capability. Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning of the three ambiguity types in terms of both reasoning accuracy ( 6 60.1%) and reasoning consistency (6 31%). We will release our test suite as a machine translation commonsense reasoning testbed to promote future work in this direction.

pdf bib
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao | Yichun Yin | Lifeng Shang | Xin Jiang | Xiao Chen | Linlin Li | Fang Wang | Qun Liu
Findings of the Association for Computational Linguistics: EMNLP 2020

Language model pre-training, such as BERT, has significantly improved the performances of many natural language processing tasks. However, pre-trained language models are usually computationally expensive, so it is difficult to efficiently execute them on resource-restricted devices. To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of the Transformer-based models. By leveraging this new KD method, the plenty of knowledge encoded in a large “teacher” BERT can be effectively transferred to a small “student” TinyBERT. Then, we introduce a new two-stage learning framework for TinyBERT, which performs Transformer distillation at both the pre-training and task-specific learning stages. This framework ensures that TinyBERT can capture the general-domain as well as the task-specific knowledge in BERT. TinyBERT4 with 4 layers is empirically effective and achieves more than 96.8% the performance of its teacher BERT-Base on GLUE benchmark, while being 7.5x smaller and 9.4x faster on inference. TinyBERT4 is also significantly better than 4-layer state-of-the-art baselines on BERT distillation, with only ~28% parameters and ~31% inference time of them. Moreover, TinyBERT6 with 6 layers performs on-par with its teacher BERT-Base.

pdf bib
Proceedings of the Second International Workshop of Discourse Processing
Qun Liu | Deyi Xiong | Shili Ge | Xiaojun Zhang
Proceedings of the Second International Workshop of Discourse Processing

pdf bib
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order
Yi Liao | Xin Jiang | Qun Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Masked language model and autoregressive language model are two types of language models. While pretrained masked language models such as BERT overwhelm the line of natural language understanding (NLU) tasks, autoregressive language models such as GPT are especially capable in natural language generation (NLG). In this paper, we propose a probabilistic masking scheme for the masked language model, which we call probabilistically masked language model (PMLM). We implement a specific PMLM with a uniform prior distribution on the masking ratio named u-PMLM. We prove that u-PMLM is equivalent to an autoregressive permutated language model. One main advantage of the model is that it supports text generation in arbitrary order with surprisingly good quality, which could potentially enable new applications over traditional unidirectional generation. Besides, the pretrained u-PMLM also outperforms BERT on a bunch of downstream NLU tasks.

pdf bib
Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT
Zhiyong Wu | Yun Chen | Ben Kao | Qun Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

By introducing a small set of additional parameters, a probe learns to solve specific linguistic tasks (e.g., dependency parsing) in a supervised manner using feature representations (e.g., contextualized embeddings). The effectiveness of such probing tasks is taken as evidence that the pre-trained model encodes linguistic knowledge. However, this approach of evaluating a language model is undermined by the uncertainty of the amount of knowledge that is learned by the probe itself. Complementary to those works, we propose a parameter-free probing technique for analyzing pre-trained language models (e.g., BERT). Our method does not require direct supervision from the probing tasks, nor do we introduce additional parameters to the probing process. Our experiments on BERT show that syntactic trees recovered from BERT using our method are significantly better than linguistically-uninformed baselines. We further feed the empirically induced dependency structures into a downstream sentiment classification task and find its improvement compatible with or even superior to a human-designed dependency schema.

pdf bib
Word-level Textual Adversarial Attacking as Combinatorial Optimization
Yuan Zang | Fanchao Qi | Chenghao Yang | Zhiyuan Liu | Meng Zhang | Qun Liu | Maosong Sun
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Adversarial attacks are carried out to reveal the vulnerability of deep neural networks. Textual adversarial attacking is challenging because text is discrete and a small perturbation can bring significant change to the original input. Word-level attacking, which can be regarded as a combinatorial optimization problem, is a well-studied class of textual attack methods. However, existing word-level attack models are far from perfect, largely because unsuitable search space reduction methods and inefficient optimization algorithms are employed. In this paper, we propose a novel attack model, which incorporates the sememe-based word substitution method and particle swarm optimization-based search algorithm to solve the two problems separately. We conduct exhaustive experiments to evaluate our attack model by attacking BiLSTM and BERT on three benchmark datasets. Experimental results demonstrate that our model consistently achieves much higher attack success rates and crafts more high-quality adversarial examples as compared to baseline methods. Also, further experiments show our model has higher transferability and can bring more robustness enhancement to victim models by adversarial training. All the code and data of this paper can be obtained on https://github.com/thunlp/SememePSO-Attack.

pdf bib
Huawei’s Submissions to the WMT20 Biomedical Translation Task
Wei Peng | Jianfeng Liu | Minghan Wang | Liangyou Li | Xupeng Meng | Hao Yang | Qun Liu
Proceedings of the Fifth Conference on Machine Translation

This paper describes Huawei’s submissions to the WMT20 biomedical translation shared task. Apart from experimenting with finetuning on domain-specific bitexts, we explore effects of in-domain dictionaries on enhancing cross-domain neural machine translation performance. We utilize a transfer learning strategy through pre-trained machine translation models and extensive scope of engineering endeavors. Four of our ten submissions achieve state-of-the-art performance according to the official automatic evaluation results, namely translation directions on English<->French, English->German and English->Italian.

pdf bib
TernaryBERT: Distillation-aware Ultra-low Bit BERT
Wei Zhang | Lu Hou | Yichun Yin | Lifeng Shang | Xiao Chen | Xin Jiang | Qun Liu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Transformer-based pre-training models like BERT have achieved remarkable performance in many natural language processing tasks. However, these models are both computation and memory expensive, hindering their deployment to resource-constrained devices. In this work, we propose TernaryBERT, which ternarizes the weights in a fine-tuned BERT model. Specifically, we use both approximation-based and loss-aware ternarization methods and empirically investigate the ternarization granularity of different parts of BERT. Moreover, to reduce the accuracy degradation caused by lower capacity of low bits, we leverage the knowledge distillation technique in the training process. Experiments on the GLUE benchmark and SQuAD show that our proposed TernaryBERT outperforms the other BERT quantization methods, and even achieves comparable performance as the full-precision model while being 14.9x smaller.

pdf bib
Accurate Word Alignment Induction from Neural Machine Translation
Yun Chen | Yang Liu | Guanhua Chen | Xin Jiang | Qun Liu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Despite its original goal to jointly learn to align and translate, prior researches suggest that Transformer captures poor word alignments through its attention mechanism. In this paper, we show that attention weights do capture accurate word alignments and propose two novel word alignment induction methods Shift-Att and Shift-AET. The main idea is to induce alignments at the step when the to-be-aligned target token is the decoder input rather than the decoder output as in previous work. Shift-Att is an interpretation method that induces alignments from the attention weights of Transformer and does not require parameter update or architecture change. Shift-AET extracts alignments from an additional alignment module which is tightly integrated into Transformer and trained in isolation with supervision from symmetrized Shift-Att alignments. Experiments on three publicly available datasets demonstrate that both methods perform better than their corresponding neural baselines and Shift-AET significantly outperforms GIZA++ by 1.4-4.8 AER points.

pdf bib
Why Skip If You Can Combine: A Simple Knowledge Distillation Technique for Intermediate Layers
Yimeng Wu | Peyman Passban | Mehdi Rezagholizadeh | Qun Liu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

With the growth of computing power neural machine translation (NMT) models also grow accordingly and become better. However, they also become harder to deploy on edge devices due to memory constraints. To cope with this problem, a common practice is to distill knowledge from a large and accurately-trained teacher network (T) into a compact student network (S). Although knowledge distillation (KD) is useful in most cases, our study shows that existing KD techniques might not be suitable enough for deep NMT engines, so we propose a novel alternative. In our model, besides matching T and S predictions we have a combinatorial mechanism to inject layer-level supervision from T to S. In this paper, we target low-resource settings and evaluate our translation engines for Portuguese→English, Turkish→English, and English→German directions. Students trained using our technique have 50% fewer parameters and can still deliver comparable results to those of 12-layer teachers.

pdf bib
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Qun Liu | David Schlangen
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

pdf bib
A General Framework for Adaptation of Neural Machine Translation to Simultaneous Translation
Yun Chen | Liangyou Li | Xin Jiang | Xiao Chen | Qun Liu
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Despite the success of neural machine translation (NMT), simultaneous neural machine translation (SNMT), the task of translating in real time before a full sentence has been observed, remains challenging due to the syntactic structure difference and simultaneity requirements. In this paper, we propose a general framework for adapting neural machine translation to translate simultaneously. Our framework contains two parts: prefix translation that utilizes a consecutive NMT model to translate source prefixes and a stopping criterion that determines when to stop the prefix translation. Experiments on three translation corpora and two language pairs show the efficacy of the proposed framework on balancing the quality and latency in adapting NMT to perform simultaneous translation.

2019

pdf bib
Improving Domain Adaptation Translation with Domain Invariant and Specific Information
Shuhao Gu | Yang Feng | Qun Liu
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In domain adaptation for neural machine translation, translation performance can benefit from separating features into domain-specific features and common features. In this paper, we propose a method to explicitly model the two kinds of information in the encoder-decoder framework so as to exploit out-of-domain data in in-domain training. In our method, we maintain a private encoder and a private decoder for each domain which are used to model domain-specific information. In the meantime, we introduce a common encoder and a common decoder shared by all the domains which can only have domain-independent information flow through. Besides, we add a discriminator to the shared encoder and employ adversarial training for the whole model to reinforce the performance of information separation and machine translation simultaneously. Experiment results show that our method can outperform competitive baselines greatly on multiple data sets.

pdf bib
Bilingual-GAN: A Step Towards Parallel Text Generation
Ahmad Rashid | Alan Do-Omri | Md. Akmal Haidar | Qun Liu | Mehdi Rezagholizadeh
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation

Latent space based GAN methods and attention based sequence to sequence models have achieved impressive results in text generation and unsupervised machine translation respectively. Leveraging the two domains, we propose an adversarial latent space based model capable of generating parallel sentences in two languages concurrently and translating bidirectionally. The bilingual generation goal is achieved by sampling from the latent space that is shared between both languages. First two denoising autoencoders are trained, with shared encoders and back-translation to enforce a shared latent state between the two languages. The decoder is shared for the two translation directions. Next, a GAN is trained to generate synthetic ‘code’ mimicking the languages’ shared latent space. This code is then fed into the decoder to generate text in either language. We perform our experiments on Europarl and Multi30k datasets, on the English-French language pair, and document our performance using both supervised and unsupervised machine translation.

pdf bib
Huawei’s NMT Systems for the WMT 2019 Biomedical Translation Task
Wei Peng | Jianfeng Liu | Liangyou Li | Qun Liu
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

This paper describes Huawei’s neural machine translation systems for the WMT 2019 biomedical translation shared task. We trained and fine-tuned our systems on a combination of out-of-domain and in-domain parallel corpora for six translation directions covering English–Chinese, English–French and English–German language pairs. Our submitted systems achieve the best BLEU scores on English–French and English–German language pairs according to the official evaluation results. In the English–Chinese translation task, our systems are in the second place. The enhanced performance is attributed to more in-domain training and more sophisticated models developed. Development of translation models and transfer learning (or domain adaptation) methods has significantly contributed to the progress of the task.

pdf bib
ERNIE: Enhanced Language Representation with Informative Entities
Zhengyan Zhang | Xu Han | Zhiyuan Liu | Xin Jiang | Maosong Sun | Qun Liu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The code and datasets will be available in the future.

pdf bib
Decomposable Neural Paraphrase Generation
Zichao Li | Xin Jiang | Lifeng Shang | Qun Liu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Paraphrasing exists at different granularity levels, such as lexical level, phrasal level and sentential level. This paper presents Decomposable Neural Paraphrase Generator (DNPG), a Transformer-based model that can learn and generate paraphrases of a sentence at different levels of granularity in a disentangled way. Specifically, the model is composed of multiple encoders and decoders with different structures, each of which corresponds to a specific granularity. The empirical study shows that the decomposition mechanism of DNPG makes paraphrase generation more interpretable and controllable. Based on DNPG, we further develop an unsupervised domain adaptation method for paraphrase generation. Experimental results show that the proposed model achieves competitive in-domain performance compared to state-of-the-art neural models, and significantly better performance when adapting to a new domain.

pdf bib
Bridging the Gap between Training and Inference for Neural Machine Translation
Wen Zhang | Yang Feng | Fandong Meng | Di You | Qun Liu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Neural Machine Translation (NMT) generates target words sequentially in the way of predicting the next word conditioned on the context words. At training time, it predicts with the ground truth words as context while at inference it has to generate the entire sequence from scratch. This discrepancy of the fed context leads to error accumulation among the way. Furthermore, word-level training requires strict matching between the generated sequence and the ground truth sequence which leads to overcorrection over different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the predicted sequence by the model during training, where the predicted sequence is selected with a sentence-level optimum. Experiment results on Chinese->English and WMT’14 English->German translation tasks demonstrate that our approach can achieve significant improvements on multiple datasets.

pdf bib
Modeling Semantic Compositionality with Sememe Knowledge
Fanchao Qi | Junjie Huang | Chenghao Yang | Zhiyuan Liu | Xiao Chen | Qun Liu | Maosong Sun
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Semantic compositionality (SC) refers to the phenomenon that the meaning of a complex linguistic unit can be composed of the meanings of its constituents. Most related works focus on using complicated compositionality functions to model SC while few works consider external knowledge in models. In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC by a confirmatory experiment. Furthermore, we make the first attempt to incorporate sememe knowledge into SC models, and employ the sememe-incorporated models in learning representations of multiword expressions, a typical task of SC. In experiments, we implement our models by incorporating knowledge from a famous sememe knowledge base HowNet and perform both intrinsic and extrinsic evaluations. Experimental results show that our models achieve significant performance boost as compared to the baseline methods without considering sememe knowledge. We further conduct quantitative analysis and case studies to demonstrate the effectiveness of applying sememe knowledge in modeling SC.All the code and data of this paper can be obtained on https://github.com/thunlp/Sememe-SC.

2018

pdf bib
Refining Source Representations with Relation Networks for Neural Machine Translation
Wen Zhang | Jiawei Hu | Yang Feng | Qun Liu
Proceedings of the 27th International Conference on Computational Linguistics

Although neural machine translation with the encoder-decoder framework has achieved great success recently, it still suffers drawbacks of forgetting distant information, which is an inherent disadvantage of recurrent neural network structure, and disregarding relationship between source words during encoding step. Whereas in practice, the former information and relationship are often useful in current step. We target on solving these problems and thus introduce relation networks to learn better representations of the source. The relation networks are able to facilitate memorization capability of recurrent neural network via associating source words with each other, this would also help retain their relationships. Then the source representations and all the relations are fed into the attention component together while decoding, with the main encoder-decoder framework unchanged. Experiments on several datasets show that our method can improve the translation performance significantly over the conventional encoder-decoder model and even outperform the approach involving supervised syntactic knowledge.

pdf bib
Tailoring Neural Architectures for Translating from Morphologically Rich Languages
Peyman Passban | Andy Way | Qun Liu
Proceedings of the 27th International Conference on Computational Linguistics

A morphologically complex word (MCW) is a hierarchical constituent with meaning-preserving subunits, so word-based models which rely on surface forms might not be powerful enough to translate such structures. When translating from morphologically rich languages (MRLs), a source word could be mapped to several words or even a full sentence on the target side, which means an MCW should not be treated as an atomic unit. In order to provide better translations for MRLs, we boost the existing neural machine translation (NMT) architecture with a double- channel encoder and a double-attentive decoder. The main goal targeted in this research is to provide richer information on the encoder side and redesign the decoder accordingly to benefit from such information. Our experimental results demonstrate that we could achieve our goal as the proposed model outperforms existing subword- and character-based architectures and showed significant improvements on translating from German, Russian, and Turkish into English.

pdf bib
Improving Character-Based Decoding Using Target-Side Morphological Information for Neural Machine Translation
Peyman Passban | Qun Liu | Andy Way
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Recently, neural machine translation (NMT) has emerged as a powerful alternative to conventional statistical approaches. However, its performance drops considerably in the presence of morphologically rich languages (MRLs). Neural engines usually fail to tackle the large vocabulary and high out-of-vocabulary (OOV) word rate of MRLs. Therefore, it is not suitable to exploit existing word-based models to translate this set of languages. In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level and boosts the decoder with target-side morphological information. In our architecture, an additional morphology table is plugged into the model. Each time the decoder samples from a target vocabulary, the table sends auxiliary signals from the most relevant affixes in order to enrich the decoder’s current state and constrain it to provide better predictions. We evaluated our model to translate English into German, Russian, and Turkish as three MRLs and observed significant improvements.

pdf bib
Knowledge Diffusion for Neural Dialogue Generation
Shuman Liu | Hongshen Chen | Zhaochun Ren | Yang Feng | Qun Liu | Dawei Yin
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

End-to-end neural dialogue generation has shown promising results recently, but it does not employ knowledge to guide the generation and hence tends to generate short, general, and meaningless responses. In this paper, we propose a neural knowledge diffusion (NKD) model to introduce knowledge into dialogue generation. This method can not only match the relevant facts for the input utterance but diffuse them to similar entities. With the help of facts matching and entity diffusion, the neural dialogue generation is augmented with the ability of convergent and divergent thinking over the knowledge base. Our empirical study on a real-world dataset prove that our model is capable of generating meaningful, diverse and natural responses for both factoid-questions and knowledge grounded chi-chats. The experiment results also show that our model outperforms competitive baseline models significantly.

pdf bib
Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data
Koel Dutta Chowdhury | Mohammed Hasanuzzaman | Qun Liu
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP

In this paper, we investigate the effectiveness of training a multimodal neural machine translation (MNMT) system with image features for a low-resource language pair, Hindi and English, using synthetic data. A three-way parallel corpus which contains bilingual texts and corresponding images is required to train a MNMT system with image features. However, such a corpus is not available for low resource language pairs. To address this, we developed both a synthetic training dataset and a manually curated development/test dataset for Hindi based on an existing English-image parallel corpus. We used these datasets to build our image description translation system by adopting state-of-the-art MNMT models. Our results show that it is possible to train a MNMT system for low-resource language pairs through the use of synthetic data and that such a system can benefit from image features.

pdf bib
E2E NLG Challenge Submission: Towards Controllable Generation of Diverse Natural Language
Henry Elder | Sebastian Gehrmann | Alexander O’Connor | Qun Liu
Proceedings of the 11th International Conference on Natural Language Generation

In natural language generation (NLG), the task is to generate utterances from a more abstract input, such as structured data. An added challenge is to generate utterances that contain an accurate representation of the input, while reflecting the fluency and variety of human-generated text. In this paper, we report experiments with NLG models that can be used in task oriented dialogue systems. We explore the use of additional input to the model to encourage diversity and control of outputs. While our submission does not rank highly using automated metrics, qualitative investigation of generated utterances suggests the use of additional information in neural network NLG systems to be a promising research direction.

pdf bib
Learning to Jointly Translate and Predict Dropped Pronouns with a Shared Reconstruction Mechanism
Longyue Wang | Zhaopeng Tu | Andy Way | Qun Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Pronouns are frequently omitted in pro-drop languages, such as Chinese, generally leading to significant challenges with respect to the production of complete translations. Recently, Wang et al. (2018) proposed a novel reconstruction-based approach to alleviating dropped pronoun (DP) translation problems for neural machine translation models. In this work, we improve the original model from two perspectives. First, we employ a shared reconstructor to better exploit encoder and decoder representations. Second, we jointly learn to translate and predict DPs in an end-to-end manner, to avoid the errors propagated from an external DP prediction model. Experimental results show that our approach significantly improves both translation performance and DP prediction accuracy.

pdf bib
Speeding Up Neural Machine Translation Decoding by Cube Pruning
Wen Zhang | Liang Huang | Yang Feng | Lei Shen | Qun Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Although neural machine translation has achieved promising results, it suffers from slow translation speed. The direct consequence is that a trade-off has to be made between translation quality and speed, thus its performance can not come into full play. We apply cube pruning, a popular technique to speed up dynamic programming, into neural machine translation to speed up the translation. To construct the equivalence class, similar target hidden states are combined, leading to less RNN expansion operations on the target side and less softmax operations over the large target vocabulary. The experiments show that, at the same or even better translation quality, our method can translate faster compared with naive beam search by 3.3x on GPUs and 3.5x on CPUs.

2017

pdf bib
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with Syntactic Dependency Features and Semantic Re-Ranking
Alfredo Maldonado | Lifeng Han | Erwan Moreau | Ashjan Alsulaimani | Koel Dutta Chowdhury | Carl Vogel | Qun Liu
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)

A description of a system for identifying Verbal Multi-Word Expressions (VMWEs) in running text is presented. The system mainly exploits universal syntactic dependency features through a Conditional Random Fields (CRF) sequence model. The system competed in the Closed Track at the PARSEME VMWE Shared Task 2017, ranking 2nd place in most languages on full VMWE-based evaluation and 1st in three languages on token-based evaluation. In addition, this paper presents an option to re-rank the 10 best CRF-predicted sequences via semantic vectors, boosting its scores above other systems in the competition. We also show that all systems in the competition would struggle to beat a simple lookup baseline system and argue for a more purpose-specific evaluation scheme.

pdf bib
Findings of the 2017 Conference on Machine Translation (WMT17)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Shujian Huang | Matthias Huck | Philipp Koehn | Qun Liu | Varvara Logacheva | Christof Monz | Matteo Negri | Matt Post | Raphael Rubino | Lucia Specia | Marco Turchi
Proceedings of the Second Conference on Machine Translation

pdf bib
CASICT-DCU Neural Machine Translation Systems for WMT17
Jinchao Zhang | Peerachet Porkaew | Jiawei Hu | Qiuye Zhao | Qun Liu
Proceedings of the Second Conference on Machine Translation

pdf bib
DCU System Report on the WMT 2017 Multi-modal Machine Translation Task
Iacer Calixto | Koel Dutta Chowdhury | Qun Liu
Proceedings of the Second Conference on Machine Translation

pdf bib
Blend: a Novel Combined MT Metric Based on Direct Assessment — CASICT-DCU submission to WMT17 Metrics Task
Qingsong Ma | Yvette Graham | Shugen Wang | Qun Liu
Proceedings of the Second Conference on Machine Translation

pdf bib
Deep Neural Machine Translation with Linear Associative Unit
Mingxuan Wang | Zhengdong Lu | Jie Zhou | Qun Liu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Deep Neural Networks (DNNs) have provably enhanced the state-of-the-art Neural Machine Translation (NMT) with its capability in modeling complex functions and capturing complex linguistic structures. However NMT with deep architecture in its encoder or decoder RNNs often suffer from severe gradient diffusion due to the non-linear recurrent activations, which often makes the optimization much more difficult. To address this problem we propose a novel linear associative units (LAU) to reduce the gradient propagation path inside the recurrent unit. Different from conventional approaches (LSTM unit and GRU), LAUs uses linear associative connections between input and output of the recurrent unit, which allows unimpeded information flow through both space and time The model is quite simple, but it is surprisingly effective. Our empirical study on Chinese-English translation shows that our model with proper configuration can improve by 11.7 BLEU upon Groundhog and the best reported on results in the same setting. On WMT14 English-German task and a larger WMT14 English-French task, our model achieves comparable results with the state-of-the-art.

pdf bib
Incorporating Word Reordering Knowledge into Attention-based Neural Machine Translation
Jinchao Zhang | Mingxuan Wang | Qun Liu | Jie Zhou
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper proposes three distortion models to explicitly incorporate the word reordering knowledge into attention-based Neural Machine Translation (NMT) for further improving translation performance. Our proposed models enable attention mechanism to attend to source words regarding both the semantic requirement and the word reordering penalty. Experiments on Chinese-English translation show that the approaches can improve word alignment quality and achieve significant translation improvements over a basic attention-based NMT by large margins. Compared with previous works on identical corpora, our system achieves the state-of-the-art performance on translation quality.

pdf bib
Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
Chris Hokamp | Qun Liu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. The algorithm can be used with any model which generates sequences token by token. Lexical constraints take the form of phrases or words that must be present in the output sequence. This is a very general way to incorporate auxillary knowledge into a model’s output without requiring any modification of the parameters or training data. We demonstrate the feasibility and flexibility of Lexically Constrained Decoding by conducting experiments on Neural Interactive-Predictive Translation, as well as Domain Adaptation for Neural Machine Translation. Experiments show that GBS can provide large improvements in translation quality in interactive scenarios, and that, even without any user input, GBS can be used to achieve significant gains in performance in domain adaptation scenarios.

pdf bib
Doubly-Attentive Decoder for Multi-modal Neural Machine Translation
Iacer Calixto | Qun Liu | Nick Campbell
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation. Our decoder learns to attend to source-language words and parts of an image independently by means of two separate attention mechanisms as it generates words in the target language. We find that our model can efficiently exploit not just back-translated in-domain multi-modal data but also large general-domain text-only MT corpora. We also report state-of-the-art results on the Multi30k data set.

pdf bib
Sentence-Level Multilingual Multi-modal Embedding for Natural Language Processing
Iacer Calixto | Qun Liu
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We propose a novel discriminative ranking model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality. To that end, we introduce an objective function that uses pairwise ranking adapted to the case of three or more input sources. We compare our model against different baselines, and evaluate the robustness of our embeddings on image–sentence ranking (ISR), semantic textual similarity (STS), and neural machine translation (NMT). We find that the additional multilingual signals lead to improvements on all three tasks, and we highlight that our model can be used to consistently improve the adequacy of translations generated with NMT models when re-ranking n-best lists.

pdf bib
Incorporating Global Visual Features into Attention-based Neural Machine Translation.
Iacer Calixto | Qun Liu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder. Global image features are extracted using a pre-trained convolutional neural network and are incorporated (i) as words in the source sentence, (ii) to initialise the encoder hidden state, and (iii) as additional data to initialise the decoder hidden state. In our experiments, we evaluate translations into English and German, how different strategies to incorporate global image features compare and which ones perform best. We also study the impact that adding synthetic multi-modal, multilingual data brings and find that the additional data have a positive impact on multi-modal NMT models. We report new state-of-the-art results and our best models also significantly improve on a comparable phrase-based Statistical MT (PBSMT) model trained on the Multi30k data set according to all metrics evaluated. To the best of our knowledge, it is the first time a purely neural model significantly improves over a PBSMT model on all metrics evaluated on this data set.

pdf bib
Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation
Qingsong Ma | Yvette Graham | Timothy Baldwin | Qun Liu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Monolingual evaluation of Machine Translation (MT) aims to simplify human assessment by requiring assessors to compare the meaning of the MT output with a reference translation, opening up the task to a much larger pool of genuinely qualified evaluators. Monolingual evaluation runs the risk, however, of bias in favour of MT systems that happen to produce translations superficially similar to the reference and, consistent with this intuition, previous investigations have concluded monolingual assessment to be strongly biased in this respect. On re-examination of past analyses, we identify a series of potential analytical errors that force some important questions to be raised about the reliability of past conclusions, however. We subsequently carry out further investigation into reference bias via direct human assessment of MT adequacy via quality controlled crowd-sourcing. Contrary to both intuition and past conclusions, results for show no significant evidence of reference bias in monolingual evaluation of MT.

pdf bib
Exploiting Cross-Sentence Context for Neural Machine Translation
Longyue Wang | Zhaopeng Tu | Andy Way | Qun Liu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In translation, considering the document as a whole can help to resolve ambiguities and inconsistencies. In this paper, we propose a cross-sentence context-aware approach and investigate the influence of historical contextual information on the performance of neural machine translation (NMT). First, this history is summarized in a hierarchical way. We then integrate the historical representation into NMT in two strategies: 1) a warm-start of encoder and decoder states, and 2) an auxiliary context source for updating decoder states. Experimental results on a large Chinese-English translation task show that our approach significantly improves upon a strong attention-based NMT system by up to +2.1 BLEU points.

pdf bib
Semantics-Enhanced Task-Oriented Dialogue Translation: A Case Study on Hotel Booking
Longyue Wang | Jinhua Du | Liangyou Li | Zhaopeng Tu | Andy Way | Qun Liu
Proceedings of the IJCNLP 2017, System Demonstrations

We showcase TODAY, a semantics-enhanced task-oriented dialogue translation system, whose novelties are: (i) task-oriented named entity (NE) definition and a hybrid strategy for NE recognition and translation; and (ii) a novel grounded semantic method for dialogue understanding and task-order management. TODAY is a case-study demo which can efficiently and accurately assist customers and agents in different languages to reach an agreement in a dialogue for the hotel booking.

pdf bib
ADAPT Centre Cone Team at IJCNLP-2017 Task 5: A Similarity-Based Logistic Regression Approach to Multi-choice Question Answering in an Examinations Shared Task
Daria Dzendzik | Alberto Poncelas | Carl Vogel | Qun Liu
Proceedings of the IJCNLP 2017, Shared Tasks

We describe the work of a team from the ADAPT Centre in Ireland in addressing automatic answer selection for the Multi-choice Question Answering in Examinations shared task. The system is based on a logistic regression over the string similarities between question, answer, and additional text. We obtain the highest grade out of six systems: 48.7% accuracy on a validation set (vs. a baseline of 29.45%) and 45.6% on a test set.

pdf bib
If You Can’t Beat Them Join Them: Handcrafted Features Complement Neural Nets for Non-Factoid Answer Reranking
Dasha Bogdanova | Jennifer Foster | Daria Dzendzik | Qun Liu
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

We show that a neural approach to the task of non-factoid answer reranking can benefit from the inclusion of tried-and-tested handcrafted features. We present a neural network architecture based on a combination of recurrent neural networks that are used to encode questions and answers, and a multilayer perceptron. We show how this approach can be combined with additional features, in particular, the discourse features used by previous research. Our neural approach achieves state-of-the-art performance on a public dataset from Yahoo! Answers and its performance is further improved by incorporating the discourse features. Additionally, we present a new dataset of Ask Ubuntu questions where the hybrid approach also achieves good results.

pdf bib
Neural Automatic Post-Editing Using Prior Alignment and Reranking
Santanu Pal | Sudip Kumar Naskar | Mihaela Vela | Qun Liu | Josef van Genabith
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We present a second-stage machine translation (MT) system based on a neural machine translation (NMT) approach to automatic post-editing (APE) that improves the translation quality provided by a first-stage MT system. Our APE system (APE_Sym) is an extended version of an attention based NMT model with bilingual symmetry employing bidirectional models, mt–pe and pe–mt. APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system. Re-ranking (APE_Rerank) of the n-best translations from the phrase-based APE and APE_Sym systems provides further substantial improvements over the symmetric neural APE model. Human evaluation confirms that the APE_Rerank generated PE translations improve on the previous best neural APE system at WMT 2016.

pdf bib
Improving Evaluation of Document-level Machine Translation Quality Estimation
Yvette Graham | Qingsong Ma | Timothy Baldwin | Qun Liu | Carla Parra | Carolina Scarton
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Meaningful conclusions about the relative performance of NLP systems are only possible if the gold standard employed in a given evaluation is both valid and reliable. In this paper, we explore the validity of human annotations currently employed in the evaluation of document-level quality estimation for machine translation (MT). We demonstrate the degree to which MT system rankings are dependent on weights employed in the construction of the gold standard, before proposing direct human assessment as a valid alternative. Experiments show direct assessment (DA) scores for documents to be highly reliable, achieving a correlation of above 0.9 in a self-replication experiment, in addition to a substantial estimated cost reduction through quality controlled crowd-sourcing. The original gold standard based on post-edits incurs a 10–20 times greater cost than DA.

pdf bib
Context-Aware Graph Segmentation for Graph-Based Translation
Liangyou Li | Andy Way | Qun Liu
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

In this paper, we present an improved graph-based translation model which segments an input graph into node-induced subgraphs by taking source context into consideration. Translations are generated by combining subgraph translations left-to-right using beam search. Experiments on Chinese–English and German–English demonstrate that the context-aware segmentation significantly improves the baseline graph-based model.

2016

pdf bib
Memory-enhanced Decoder for Neural Machine Translation
Mingxuan Wang | Zhengdong Lu | Hang Li | Qun Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Variational Neural Discourse Relation Recognizer
Biao Zhang | Deyi Xiong | Jinsong Su | Qun Liu | Rongrong Ji | Hong Duan | Min Zhang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Neural Network for Heterogeneous Annotations
Hongshen Chen | Yue Zhang | Qun Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Calculating the percentage reduction in translator effort when using machine translation
Andrzej Zydrón | Qun Liu
Proceedings of Translating and the Computer 38

pdf bib
Graph-Based Translation Via Graph Segmentation
Liangyou Li | Andy Way | Qun Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Phrase-Level Combination of SMT and TM Using Constrained Word Lattice
Liangyou Li | Andy Way | Qun Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
ProphetMT: A Tree-based SMT-driven Controlled Language Authoring/Post-Editing Tool
Xiaofeng Wu | Jinhua Du | Qun Liu | Andy Way
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents ProphetMT, a tree-based SMT-driven Controlled Language (CL) authoring and post-editing tool. ProphetMT employs the source-side rules in a translation model and provides them as auto-suggestions to users. Accordingly, one might say that users are writing in a Controlled Language that is understood by the computer. ProphetMT also allows users to easily attach structural information as they compose content. When a specific rule is selected, a partial translation is promptly generated on-the-fly with the help of the structural information. Our experiments conducted on English-to-Chinese show that our proposed ProphetMT system can not only better regularise an author’s writing behaviour, but also significantly improve translation fluency which is vital to reduce the post-editing time. Additionally, when the writing and translation process is over, ProphetMT can provide an effective colour scheme to further improve the productivity of post-editors by explicitly featuring the relations between the source and target rules.

pdf bib
Automatic Construction of Discourse Corpora for Dialogue Translation
Longyue Wang | Xiaojun Zhang | Zhaopeng Tu | Andy Way | Qun Liu
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, a novel approach is proposed to automatically construct parallel discourse corpus for dialogue machine translation. Firstly, the parallel subtitle data and its corresponding monolingual movie script data are crawled and collected from Internet. Then tags such as speaker and discourse boundary from the script data are projected to its subtitle data via an information retrieval approach in order to map monolingual discourse to bilingual texts. We not only evaluate the mapping results, but also integrate speaker information into the translation. Experiments show our proposed method can achieve 81.79% and 98.64% accuracy on speaker and dialogue boundary annotation, and speaker-based language model adaptation can obtain around 0.5 BLEU points improvement in translation qualities. Finally, we publicly release around 100K parallel discourse data with manual speaker and dialogue boundary annotation.

pdf bib
Extending Phrase-Based Translation with Dependencies by Using Graphs
Liangyou Li | Andy Way | Qun Liu
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)

pdf bib
Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity
Peyman Passban | Chris Hokamp | Andy Way | Qun Liu
Proceedings of the 19th Annual Conference of the European Association for Machine Translation

pdf bib
Combining Translation Memories and Syntax-Based SMT: Experiments with Real Industrial Data
Liangyou Li | Carla Parra Escartin | Qun Liu
Proceedings of the 19th Annual Conference of the European Association for Machine Translation

pdf bib
Achieving Accurate Conclusions in Evaluation of Automatic Machine Translation Metrics
Yvette Graham | Qun Liu
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
A Novel Approach to Dropped Pronoun Translation
Longyue Wang | Zhaopeng Tu | Xiaojun Zhang | Hang Li | Andy Way | Qun Liu
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
A subtree-based factorization of dependency parsing
Qiuye Zhao | Qun Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a dependency parsing pipeline, in which the parsing of long-distance projections and localized dependencies are explicitly decomposed at the input level. A chosen baseline dependency parsing model performs only on ‘carved’ sequences at the second stage, which are transformed from coarse constituent parsing outputs at the first stage. When k-best constituent parsing outputs are kept, a third-stage is required to search for an optimal combination of the overlapped dependency subtrees. In this sense, our dependency model is subtree-factored. We explore alternative approaches for scoring subtrees, including feature-based models as well as continuous representations. The search for optimal subset to combine is formulated as an ILP problem. This framework especially benefits the models poor on long sentences, generally improving baselines by 0.75-1.28 (UAS) on English, achieving comparable performance with high-order models but faster. For Chinese, the most notable increase is as high as 3.63 (UAS) when the proposed framework is applied to first-order parsing models.

pdf bib
Fast Gated Neural Domain Adaptation: Language Model as a Case Study
Jian Zhang | Xiaofeng Wu | Andy Way | Qun Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Neural network training has been shown to be advantageous in many natural language processing applications, such as language modelling or machine translation. In this paper, we describe in detail a novel domain adaptation mechanism in neural network training. Instead of learning and adapting the neural network on millions of training sentences – which can be very time-consuming or even infeasible in some cases – we design a domain adaptation gating mechanism which can be used in recurrent neural networks and quickly learn the out-of-domain knowledge directly from the word vector representations with little speed overhead. In our experiments, we use the recurrent neural network language model (LM) as a case study. We show that the neural LM perplexity can be reduced by 7.395 and 12.011 using the proposed domain adaptation mechanism on the Penn Treebank and News data, respectively. Furthermore, we show that using the domain-adapted neural LM to re-rank the statistical machine translation n-best list on the French-to-English language pair can significantly improve translation quality.

pdf bib
Topic-Informed Neural Machine Translation
Jian Zhang | Liangyou Li | Andy Way | Qun Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In recent years, neural machine translation (NMT) has demonstrated state-of-the-art machine translation (MT) performance. It is a new approach to MT, which tries to learn a set of parameters to maximize the conditional probability of target sentences given source sentences. In this paper, we present a novel approach to improve the translation performance in NMT by conveying topic knowledge during translation. The proposed topic-informed NMT can increase the likelihood of selecting words from the same topic and domain for translation. Experimentally, we demonstrate that topic-informed NMT can achieve a 1.15 (3.3% relative) and 1.67 (5.4% relative) absolute improvement in BLEU score on the Chinese-to-English language pair using NIST 2004 and 2005 test sets, respectively, compared to NMT without topic information.

pdf bib
Interactive Attention for Neural Machine Translation
Fandong Meng | Zhengdong Lu | Hang Li | Qun Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Conventional attention-based Neural Machine Translation (NMT) conducts dynamic alignment in generating the target sentence. By repeatedly reading the representation of source sentence, which keeps fixed after generated by the encoder (Bahdanau et al., 2015), the attention mechanism has greatly enhanced state-of-the-art NMT. In this paper, we propose a new attention mechanism, called INTERACTIVE ATTENTION, which models the interaction between the decoder and the representation of source sentence during translation by both reading and writing operations. INTERACTIVE ATTENTION can keep track of the interaction history and therefore improve the translation performance. Experiments on NIST Chinese-English translation task show that INTERACTIVE ATTENTION can achieve significant improvements over both the previous attention-based NMT baseline and some state-of-the-art variants of attention-based NMT (i.e., coverage models (Tu et al., 2016)). And neural machine translator with our INTERACTIVE ATTENTION can outperform the open source attention-based NMT system Groundhog by 4.22 BLEU points and the open source phrase-based system Moses by 3.94 BLEU points averagely on multiple test sets.

pdf bib
Enriching Phrase Tables for Statistical Machine Translation Using Mixed Embeddings
Peyman Passban | Qun Liu | Andy Way
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

The phrase table is considered to be the main bilingual resource for the phrase-based statistical machine translation (PBSMT) model. During translation, a source sentence is decomposed into several phrases. The best match of each source phrase is selected among several target-side counterparts within the phrase table, and processed by the decoder to generate a sentence-level translation. The best match is chosen according to several factors, including a set of bilingual features. PBSMT engines by default provide four probability scores in phrase tables which are considered as the main set of bilingual features. Our goal is to enrich that set of features, as a better feature set should yield better translations. We propose new scores generated by a Convolutional Neural Network (CNN) which indicate the semantic relatedness of phrase pairs. We evaluate our model in different experimental settings with different language pairs. We observe significant improvements when the proposed features are incorporated into the PBSMT pipeline.

2015

pdf bib
Dependency Graph-to-String Translation
Liangyou Li | Andy Way | Qun Liu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
ParFDA for Fast Deployment of Accurate Statistical Machine Translation Systems, Benchmarks, and Statistics
Ergun Biçici | Qun Liu | Andy Way
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
Referential Translation Machines for Predicting Translation Quality and Related Statistics
Ergun Biçici | Qun Liu | Andy Way
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
CASICT-DCU Participation in WMT2015 Metrics Task
Hui Yu | Qingsong Ma | Xiaofeng Wu | Qun Liu
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
MT Tuning on RED: A Dependency-Based Evaluation Metric
Liangyou Li | Hui Yu | Qun Liu
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
Benchmarking SMT Performance for Farsi Using the TEP++ Corpus
Peyman Passban | Andy Way | Qun Liu
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
HandyCAT - An Open-Source Platform for CAT Tool Research
Christopher Hokamp | Qun Liu
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
Benchmarking SMT Performance for Farsi Using the TEP++ Corpus
Peyman Passban | Andy Way | Qun Liu
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
HandyCAT - An Open-Source Platform for CAT Tool Research
Chris Hokamp | Qun Liu
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
Automatic Adaptation of Annotations
Wenbin Jiang | Yajuan Lü | Liang Huang | Qun Liu
Computational Linguistics, Volume 41, Issue 1 - March 2015

pdf bib
Encoding Source Language with Convolutional Neural Network for Machine Translation
Fandong Meng | Zhengdong Lu | Mingxuan Wang | Hang Li | Wenbin Jiang | Qun Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
genCNN: A Convolutional Architecture for Word Sequence Prediction
Mingxuan Wang | Zhengdong Lu | Hang Li | Wenbin Jiang | Qun Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
The DCU Discourse Parser: A Sense Classification Task
Tsuyoshi Okita | Longyue Wang | Qun Liu
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task

pdf bib
The DCU Discourse Parser for Connective, Argument Identification and Explicit Sense Classification
Longyue Wang | Chris Hokamp | Tsuyoshi Okita | Xiaojun Zhang | Qun Liu
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task

pdf bib
The EXPERT project: Advancing the state of the art in hybrid translation technologies
Constantin Orasan | Alessandro Cattelan | Gloria Corpas Pastor | Josef van Genabith | Manuel Herranz | Juan José Arevalillo | Qun Liu | Khalil Sima’an | Lucia Specia
Proceedings of Translating and the Computer 37

2014

pdf bib
Active Learning for Post-Editing Based Incrementally Retrained MT
Aswarth Abhilash Dara | Josef van Genabith | Qun Liu | John Judge | Antonio Toral
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

pdf bib
A probabilistic feature-based fill-up for SMT
Jian Zhang | Liangyou Li | Andy Way | Qun Liu
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

In this paper, we describe an effective translation model combination approach based on the estimation of a probabilistic Support Vector Machine (SVM). We collect domain knowledge from both in-domain and general-domain corpora inspired by a commonly used data selection algorithm, which we then use as features for the SVM training. Drawing on previous work on binary-featured phrase table fill-up (Nakov, 2008; Bisazza et al., 2011), we substitute the binary feature in the original work with our probabilistic domain-likeness feature. Later, we design two experiments to evaluate the proposed probabilistic feature-based approach on the French-to-English language pair using data provided at WMT07, WMT13 and IWLST11 translation tasks. Our experiments demonstrate that translation performance can gain significant improvements of up to +0.36 and +0.82 BLEU scores by using our probabilistic feature-based translation model fill-up approach compared with the binary featured fill-up approach in both experiments.

pdf bib
Review and analysis of China workshop on machine translation 2013 evaluation
Sitong Yang | Heng Yu | Hongmei Zhao | Qun Liu | Yajuan Lü
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

This paper gives a general review and detailed analysis of China Workshop on Machine Translation (CWMT) Evaluation. Compared with the past CWMT evaluation campaigns, CWMT2013 evaluation is characterized as follows: first, adopting gray-box evaluation which makes the results more replicable and controllable; second, adding one rule-based system as a counterpart; third, carrying out manual evaluations on some specific tasks to give a more comprehensive analysis of the translation errors. Boosted by those new features, our analysis and case study on the evaluation results shows the pros and cons of both rule-based and statistical systems, and reveals some interesting correlations bewteen automatic and manual evaluation metrics on different translation systems.

pdf bib
A discriminative framework of integrating translation memory features into SMT
Liangyou Li | Andy Way | Qun Liu
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

Combining Translation Memory (TM) with Statistical Machine Translation (SMT) together has been demonstrated to be beneficial. In this paper, we present a discriminative framework which can integrate TM into SMT by incorporating TM-related feature functions. Experiments on English–Chinese and English–French tasks show that our system using TM feature functions only from the best fuzzy match performs significantly better than the baseline phrase- based system on both tasks, and our discriminative model achieves comparable results to those of an effective generative model which uses similar features. Furthermore, with the capacity of handling a large amount of features in the discriminative framework, we propose a method to efficiently use multiple fuzzy matches which brings more feature functions and further significantly improves our system.

pdf bib
Parallel FDA5 for Fast Deployment of Accurate Statistical Machine Translation Systems
Ergun Biçici | Qun Liu | Andy Way
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
The DCU-ICTCAS MT system at WMT 2014 on German-English Translation Task
Liangyou Li | Xiaofeng Wu | Santiago Cortés Vaíllo | Jun Xie | Andy Way | Qun Liu
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
DCU-Lingo24 Participation in WMT 2014 Hindi-English Translation task
Xiaofeng Wu | Rejwanul Haque | Tsuyoshi Okita | Piyush Arora | Andy Way | Qun Liu
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
DCU Terminology Translation System for Medical Query Subtask at WMT14
Tsuyoshi Okita | Ali Vahid | Andy Way | Qun Liu
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
RED, The DCU-CASICT Submission of Metrics Tasks
Xiaofeng Wu | Hui Yu | Qun Liu
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
Transformation and Decomposition for Efficiently Implementing and Improving Dependency-to-String Model In Moses
Liangyou Li | Jun Xie | Andy Way | Qun Liu
Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
A Dependency Edge-based Transfer Model for Statistical Machine Translation
Hongshen Chen | Jun Xie | Fandong Meng | Wenbin Jiang | Qun Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
A Structured Language Model for Incremental Tree-to-String Translation
Heng Yu | Haitao Mi | Liang Huang | Qun Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Annotation Adaptation and Language Adaptation in NLP
Qun Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
RED: A Reference Dependency Based MT Evaluation Metric
Hui Yu | Xiaofeng Wu | Jun Xie | Wenbin Jiang | Qun Liu | Shouxun Lin
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Augment Dependency-to-String Translation with Fixed and Floating Structures
Jun Xie | Jinan Xu | Qun Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Tutorial Abstracts
Qun Liu | Fei Xia
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Tutorial Abstracts

pdf bib
Syntactic SMT Using a Discriminative Text Generation Model
Yue Zhang | Kai Song | Linfeng Song | Jingbo Zhu | Qun Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Modeling Term Translation for Document-informed Machine Translation
Fandong Meng | Deyi Xiong | Wenbin Jiang | Qun Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Improving Alignment of System Combination by Using Multi-objective Optimization
Tian Xia | Zongcheng Ji | Shaodan Zhai | Yidong Chen | Qun Liu | Shaojun Wang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Translation with Source Constituency and Dependency Trees
Fandong Meng | Jun Xie | Linfeng Song | Yajuan Lü | Qun Liu
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Topic-Triggered Language Model for Statistical Machine Translation
Heng Yu | Jinsong Su | Yajuan Lv | Qun Liu
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Shallow Semantically-Informed PBSMT and HPBSMT
Tsuyoshi Okita | Qun Liu | Josef van Genabith
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
The CNGL-DCU-Prompsit Translation Systems for WMT13
Raphael Rubino | Antonio Toral | Santiago Cortés Vaíllo | Jun Xie | Xiaofeng Wu | Stephen Doherty | Qun Liu
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
DCU Participation in WMT2013 Metrics Task
Xiaofeng Wu | Hui Yu | Qun Liu
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
Wenbin Jiang | Meng Sun | Yajuan Lü | Yating Yang | Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Bilingually-Guided Monolingual Dependency Grammar Induction
Kai Liu | Yajuan Lü | Wenbin Jiang | Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
A Novel Graph-based Compact Representation of Word Alignment
Qun Liu | Zhaopeng Tu | Shouxun Lin
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Stem Translation with Affix-Based Rule Selection for Agglutinative Languages
Zhiyang Wang | Yajuan Lü | Meng Sun | Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation
Guosheng Ben | Deyi Xiong | Zhiyang Teng | Yajuan Lü | Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Iterative Transformation of Annotation Guidelines for Constituency Parsing
Xiang Li | Wenbin Jiang | Yajuan Lü | Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Machine Translation in CNGL II
Qun Liu
Proceedings of Machine Translation Summit XIV: European projects

2012

pdf bib
ICT:A System Combination for Chinese Semantic Dependency Parsing
Hao Xiong | Qun Liu
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf bib
ICT: A Translation based Method for Cross-lingual Textual Entailment
Fandong Meng | Hao Xiong | Qun Liu
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf bib
Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation
Wenbin Jiang | Fandong Meng | Qun Liu | Yajuan Lü
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Left-to-Right Tree-to-String Decoding with Prediction
Yang Feng | Yang Liu | Qun Liu | Trevor Cohn
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
ICT: System Description for CoNLL-2012
Hao Xiong | Qun Liu
Joint Conference on EMNLP and CoNLL - Shared Task

pdf bib
System Combination with Extra Alignment Information
Xiaofeng Wu | Tsuyoshi Okita | Josef van Genabith | Qun Liu
Proceedings of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT

pdf bib
Unsupervised Discriminative Induction of Synchronous Grammar for Machine Translation
Xinyan Xiao | Deyi Xiong | Yang Liu | Qun Liu | Shouxun Lin
Proceedings of COLING 2012

pdf bib
Discriminative Boosting from Dictionary and Raw Text – A Novel Approach to Build A Chinese Word Segmenter
Fandong Meng | Wenbin Jiang | Hao Xiong | Qun Liu
Proceedings of COLING 2012: Posters

pdf bib
Combining Multiple Alignments to Improve Machine Translation
Zhaopeng Tu | Yang Liu | Yifan He | Josef van Genabith | Qun Liu | Shouxun Lin
Proceedings of COLING 2012: Posters

pdf bib
Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information
Jinsong Su | Hua Wu | Haifeng Wang | Yidong Chen | Xiaodong Shi | Huailin Dong | Qun Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
A Topic Similarity Model for Hierarchical Phrase-based Translation
Xinyan Xiao | Deyi Xiong | Min Zhang | Qun Liu | Shouxun Lin
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Hierarchical Chunk-to-String Translation
Yang Feng | Dongdong Zhang | Mu Li | Qun Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Identifying High-Impact Sub-Structures for Convolution Kernels in Document-level Sentiment Classification
Zhaopeng Tu | Yifan He | Jennifer Foster | Josef van Genabith | Qun Liu | Shouxun Lin
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

pdf bib
ETS: An Error Tolerable System for Coreference Resolution
Hao Xiong | Linfeng Song | Fandong Meng | Yang Liu | Qun Liu | Yajuan Lv
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

pdf bib
A novel dependency-to-string model for statistical machine translation
Jun Xie | Haitao Mi | Qun Liu
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
Xinyan Xiao | Yang Liu | Qun Liu | Shouxun Lin
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Relaxed Cross-lingual Projection of Constituent Syntax
Wenbin Jiang | Qun Liu | Yajuan Lv
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Extracting Hierarchical Rules from a Weighted Alignment Matrix
Zhaopeng Tu | Yang Liu | Qun Liu | Shouxun Lin
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Maximum Rank Correlation Training for Statistical Machine Translation
Daqi Zheng | Yifan He | Yang Liu | Qun Liu
Proceedings of Machine Translation Summit XIII: Papers

pdf bib
Bagging-based System Combination for Domain Adaption
Linfeng Song | Haitao Mi | Yajuan Lü | Qun Liu
Proceedings of Machine Translation Summit XIII: Papers

pdf bib
Multi-granularity Word Alignment and Decoding for Agglutinative Language Translation
Zhiyang Wang | Yajuan Lü | Qun Liu
Proceedings of Machine Translation Summit XIII: Papers

pdf bib
Feedback Selecting of Manually Acquired Rules Using Automatic Evaluation
Xianhua Li | Yajuan Lü | Yao Meng | Qun Liu | Hao Yu
Proceedings of the 4th Workshop on Patent Translation

pdf bib
Adjoining Tree-to-String Translation
Yang Liu | Qun Liu | Yajuan Lü
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
Statistical Translation Model Based On Source Syntax Structure
Qun Liu | Yang Liu | Haitao Mi
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

pdf bib
Dependency Parsing and Projection Based on Word-Pair Classification
Wenbin Jiang | Qun Liu
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Constituency to Dependency Translation with Forests
Haitao Mi | Qun Liu
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Learning Lexicalized Reordering Models from Reordering Graphs
Jinsong Su | Yang Liu | Yajuan Lv | Haitao Mi | Qun Liu
Proceedings of the ACL 2010 Conference Short Papers

pdf bib
Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules
Zhiyang Wang | Yajuan Lv | Qun Liu | Young-Sook Hwang
Proceedings of the ACL 2010 Conference Short Papers

pdf bib
The ICT statistical machine translation system for IWSLT 2010
Hao Xiong | Jun Xie | Hui Yu | Kai Liu | Wei Luo | Haitao Mi | Yang Liu | Yajuan Lü | Qun Liu
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

pdf bib
Joint Parsing and Translation
Yang Liu | Qun Liu
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf bib
Dependency Forest for Statistical Machine Translation
Zhaopeng Tu | Yang Liu | Young-Sook Hwang | Qun Liu | Shouxun Lin
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf bib
Joint Tokenization and Translation
Xinyan Xiao | Yang Liu | Young-Sook Hwang | Qun Liu | Shouxun Lin
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf bib
An Efficient Shift-Reduce Decoding Algorithm for Phrased-Based Machine Translation
Yang Feng | Haitao Mi | Yang Liu | Qun Liu
Coling 2010: Posters

pdf bib
Effective Constituent Projection across Languages
Wenbin Jiang | Yajuan Lv | Yang Liu | Qun Liu
Coling 2010: Posters

pdf bib
Machine Translation with Lattices and Forests
Haitao Mi | Liang Huang | Qun Liu
Coling 2010: Posters

pdf bib
Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation
Jinsong Su | Yang Liu | Haitao Mi | Hongmei Zhao | Yajuan Lv | Qun Liu
Coling 2010: Posters

pdf bib
Discriminative Word Alignment by Linear Modeling
Yang Liu | Qun Liu | Shouxun Lin
Computational Linguistics, Volume 36, Issue 3 - September 2010

2009

pdf bib
Introduction to China’s CWMT2008 Machine Translation Evaluation
Hongmei Zhao | Jun Xie | Qun Liu | Yajuan Lü | Dongdong Zhang | Mu Li
Proceedings of Machine Translation Summit XII: Papers

pdf bib
Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging – A Case Study
Wenbin Jiang | Liang Huang | Qun Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Improving Tree-to-Tree Translation with Packed Forests
Yang Liu | Yajuan Lü | Qun Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Joint Decoding with Multiple Translation Models
Yang Liu | Haitao Mi | Yang Feng | Qun Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Reducing SMT Rule Table with Monolingual Key Phrase
Zhongjun He | Yao Meng | Yajuan Lü | Hao Yu | Qun Liu
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf bib
Sub-Sentence Division for Tree-Based Machine Translation
Hao Xiong | Wenwen Xu | Haitao Mi | Yang Liu | Qun Liu
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf bib
Weighted Alignment Matrices for Statistical Machine Translation
Yang Liu | Tian Xia | Xinyan Xiao | Qun Liu
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Lattice-based System Combination for Statistical Machine Translation
Yang Feng | Yang Liu | Haitao Mi | Qun Liu | Yajuan Lü
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Bilingually-Constrained (Monolingual) Shift-Reduce Parsing
Liang Huang | Wenbin Jiang | Qun Liu
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
The ICT statistical machine translation system for the IWSLT 2009
Haitao Mi | Yang Li | Tian Xia | Xinyan Xiao | Yang Feng | Jun Xie | Hao Xiong | Zhaopeng Tu | Daqi Zheng | Yanjuan Lu | Qun Liu
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the ICT Statistical Machine Translation systems that used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2009. For this year’s evaluation, we participated in the Challenge Task (Chinese-English and English-Chinese) and BTEC Task (Chinese-English). And we mainly focus on one new method to improve single system’s translation quality. Specifically, we developed a sentence-similarity based development set selection technique. For each task, we finally submitted the single system who got the maximum BLEU scores on the selected development set. The four single translation systems are based on different techniques: a linguistically syntax-based system, two formally syntax-based systems and a phrase-based system. Typically, we didn’t use any rescoring or system combination techniques in this year’s evaluation.

pdf bib
Improving Statistical Machine Translation Using Domain Bilingual Multiword Expressions
Zhixiang Ren | Yajuan Lü | Jie Cao | Qun Liu | Yun Huang
Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications (MWE 2009)

pdf bib
Automatic Adaptation of Annotation Standards for Dependency Parsing ? Using Projected Treebank as Source Corpus
Wenbin Jiang | Qun Liu
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

2008

pdf bib
Improving Statistical Machine Translation using Lexicalized Rule Selection
Zhongjun He | Qun Liu | Shouxun Lin
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging
Wenbin Jiang | Haitao Mi | Qun Liu
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Maximum Entropy based Rule Selection Model for Syntax-based Statistical Machine Translation
Qun Liu | Zhongjun He | Yang Liu | Shouxun Lin
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Refinements in BTG-based Statistical Machine Translation
Deyi Xiong | Min Zhang | AiTi Aw | Haitao Mi | Qun Liu | Shouxun Lin
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
The ICT system description for IWSLT 2008.
Yang Liu | Zhongjun He | Haitao Mi | Yun Huang | Yang Feng | Wenbin Jiang | Yajuan Lu | Qun Liu
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper presents a description for the ICT systems involved in the IWSLT 2008 evaluation campaign. This year, we participated in Chinese-English and English-Chinese translation directions. Four statistical machine translation systems were used: one linguistically syntax-based, two formally syntax-based, and one phrase-based. The outputs of the four SMT systems were fed to a sentence-level system combiner, which was expected to produce better translations than single systems. We will report the results of the four single systems and the combiner on both the development and test sets.

pdf bib
Forest-Based Translation
Haitao Mi | Liang Huang | Qun Liu
Proceedings of ACL-08: HLT

pdf bib
A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
Wenbin Jiang | Liang Huang | Qun Liu | Yajuan Lü
Proceedings of ACL-08: HLT

pdf bib
Partial Matching Strategy for Phrase-based Statistical Machine Translation
Zhongjun He | Qun Liu | Shouxun Lin
Proceedings of ACL-08: HLT, Short Papers

2007

pdf bib
A Dependency Treelet String Correspondence Model for Statistical Machine Translation
Deyi Xiong | Qun Liu | Shouxun Lin
Proceedings of the Second Workshop on Statistical Machine Translation

pdf bib
Improving Statistical Machine Translation Performance by Training Data Selection and Optimization
Yajuan Lü | Jin Huang | Qun Liu
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

pdf bib
Forest-to-String Statistical Translation Rules
Yang Liu | Yun Huang | Qun Liu | Shouxun Lin
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
The ICT statistical machine translation systems for IWSLT 2007
Zhongjun He | Haitao Mi | Yang Liu | Deyi Xiong | Weihua Luo | Yun Huang | Zhixiang Ren | Yajuan Lu | Qun Liu
Proceedings of the Fourth International Workshop on Spoken Language Translation

In this paper, we give an overview of the ICT statistical machine translation systems for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007. In this year’s evaluation, we participated in the Chinese-English transcript translation task, and developed three systems based on different techniques: a formally syntax-based system Bruin, an extended phrase-based system Confucius and a linguistically syntax-based system Lynx. We will describe the models of these three systems, and compare their performance in detail. We set Bruin as our primary system, which ranks 2 among the 15 primary results according to the official evaluation results.

2006

pdf bib
Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation
Deyi Xiong | Qun Liu | Shouxun Lin
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Tree-to-String Alignment Template for Statistical Machine Translation
Yang Liu | Qun Liu | Shouxun Lin
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

pdf bib
Introduction to China’s HTRDP Machine Translation Evaluation
Qun Liu | Hongxu Hou | Shouxun Lin | Yueliang Qian | Yujie Zhang | Hitoshi Isahara
Proceedings of Machine Translation Summit X: Invited papers

Since 1994, China’s HTRDP machine translation evaluation has been conducted for five times. Systems of various translation directions between Chinese, English, Japanese and French have been tested. Both human evaluation and automatic evaluation are conducted in HTRDP evaluation. In recent years, the evaluation was organized jointly with NICT of Japan. This paper introduces some details of this evaluation.

pdf bib
A Multi-aligner for Japanese-Chinese Parallel Corpora
Yujie Zhang | Qun Liu | Qing Ma | Hitoshi Isahara
Proceedings of Machine Translation Summit X: Papers

Automatic word alignment is an important technology for extracting translation knowledge from parallel corpora. However, automatic techniques cannot resolve this problem completely because of variances in translations. We therefore need to investigate the performance potential of automatic word alignment and then decide how to suitably apply it. In this paper we first propose a lexical knowledge-based approach to word alignment on a Japanese-Chinese corpus. Then we evaluate the performance of the proposed approach on the corpus. At the same time we also apply a statistics-based approach, the well-known toolkit GIZA++, to the same test data. Through comparison of the performances of the two approaches, we propose a multi-aligner, exploiting the lexical knowledge-based aligner and the statistics-based aligner at the same time. Quantitative results confirmed the effectiveness of the multi-aligner.

pdf bib
Log-Linear Models for Word Alignment
Yang Liu | Qun Liu | Shouxun Lin
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
Parsing the Penn Chinese Treebank with Semantic Knowledge
Deyi Xiong | Shuanglong Li | Qun Liu | Shouxun Lin | Yueliang Qian
Second International Joint Conference on Natural Language Processing: Full Papers

2003

pdf bib
Chinese Named Entity Recognition Using Role Model
Hua-Ping Zhang | Qun Liu | Hong-Kui Yu | Xue-Qi Cheng | Shuo Bai
International Journal of Computational Linguistics & Chinese Language Processing, Volume 8, Number 2, August 2003

pdf bib
Chinese Lexical Analysis Using Hierarchical Hidden Markov Model
Hua-Ping Zhang | Qun Liu | Xue-Qi Cheng | Hao Zhang | Hong-Kui Yu
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

pdf bib
HHMM-based Chinese Lexical Analyzer ICTCLAS
Hua-Ping Zhang | Hong-Kui Yu | De-Yi Xiong | Qun Liu
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

2002

pdf bib
A Character-net Based Chinese Text Segmentation Method
Lixin Zhou | Qun Liu
COLING-02: SEMANET: Building and Using Semantic Networks

pdf bib
Automatic Recognition of Chinese Unknown Words Based on Roles Tagging
Kevin Zhang | Qun Liu | Hao Zhang | Xue-Qi Cheng
COLING-02: The First SIGHAN Workshop on Chinese Language Processing

pdf bib
基於《知網》的辭彙語義相似度計算 (Word Similarity Computing Based on How-net) [In Chinese]
Qun Liu | Sujian Li
International Journal of Computational Linguistics & Chinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational Chinese Lexical Semantics

1998

pdf bib
TransEasy: A Chinese-English machine translation system based on hybrid approach
Qun Liu | Shiwen Yu
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: System Descriptions

This paper describes the progress of a machine translation system from Chinese to English. The system is based on a reusable platform of MT software components. It’s a rule-based system, and some statistical algorithms are used as heuristic functions in parsing as well. There are about 50,000 Chinese words and 400 global parsing rules in the system. The system got a good result in a public test of MT system in China in Mar. 1998. It is a research vehicle up to now.
Search
Co-authors