Hui Liu


2022

pdf bib
Toward Annotator Group Bias in Crowdsourcing
Haochen Liu | Joseph Thekinen | Sinem Mollaoglu | Da Tang | Ji Yang | Youlong Cheng | Hui Liu | Jiliang Tang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Crowdsourcing has emerged as a popular approach for collecting annotated data to train supervised machine learning models. However, annotator bias can lead to defective annotations. Though there are a few works investigating individual annotator bias, the group effects in annotators are largely overlooked. In this work, we reveal that annotators within the same demographic group tend to show consistent group bias in annotation tasks and thus we conduct an initial study on annotator group bias. We first empirically verify the existence of annotator group bias in various real-world crowdsourcing datasets. Then, we develop a novel probabilistic graphical framework GroupAnno to capture annotator group bias with an extended Expectation Maximization (EM) algorithm. We conduct experiments on both synthetic and real-world datasets. Experimental results demonstrate the effectiveness of our model in modeling annotator group bias in label aggregation and model learning over competitive baselines.

2021

pdf bib
Video Paragraph Captioning as a Text Summarization Task
Hui Liu | Xiaojun Wan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Video paragraph captioning aims to generate a set of coherent sentences to describe a video that contains several events. Most previous methods simplify this task by using ground-truth event segments. In this work, we propose a novel framework by taking this task as a text summarization task. We first generate lots of sentence-level captions focusing on different video clips and then summarize these captions to obtain the final paragraph caption. Our method does not depend on ground-truth event segments. Experiments on two popular datasets ActivityNet Captions and YouCookII demonstrate the advantages of our new framework. On the ActivityNet dataset, our method even outperforms some previous methods using ground-truth event segment labels.

pdf bib
Enhancing Descriptive Image Captioning with Natural Language Inference
Zhan Shi | Hui Liu | Xiaodan Zhu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Generating descriptive sentences that convey non-trivial, detailed, and salient information about images is an important goal of image captioning. In this paper we propose a novel approach to encourage captioning models to produce more detailed captions using natural language inference, based on the motivation that, among different captions of an image, descriptive captions are more likely to entail less descriptive captions. Specifically, we construct directed inference graphs for reference captions based on natural language inference. A PageRank algorithm is then employed to estimate the descriptiveness score of each node. Built on that, we use reference sampling and weighted designated rewards to guide captioning to generate descriptive captions. The results on MSCOCO show that the proposed method outperforms the baselines significantly on a wide range of conventional and descriptiveness-related evaluation metrics.

pdf bib
Retrieval, Analogy, and Composition: A framework for Compositional Generalization in Image Captioning
Zhan Shi | Hui Liu | Martin Renqiang Min | Christopher Malon | Li Erran Li | Xiaodan Zhu
Findings of the Association for Computational Linguistics: EMNLP 2021

Image captioning systems are expected to have the ability to combine individual concepts when describing scenes with concept combinations that are not observed during training. In spite of significant progress in image captioning with the help of the autoregressive generation framework, current approaches fail to generalize well to novel concept combinations. We propose a new framework that revolves around probing several similar image caption training instances (retrieval), performing analogical reasoning over relevant entities in retrieved prototypes (analogy), and enhancing the generation process with reasoning outcomes (composition). Our method augments the generation model by referring to the neighboring instances in the training set to produce novel concept combinations in generated captions. We perform experiments on the widely used image captioning benchmarks. The proposed models achieve substantial improvement over the compared baselines on both composition-related evaluation metrics and conventional image captioning metrics.

pdf bib
Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning
Hui Liu | Danqing Zhang | Bing Yin | Xiaodan Zhu
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Exploiting label hierarchies has become a promising approach to tackling the zero-shot multi-label text classification (ZS-MTC) problem. Conventional methods aim to learn a matching model between text and labels, using a graph encoder to incorporate label hierarchies to obtain effective label representations (Rios and Kavuluru, 2018). More recently, pretrained models like BERT (Devlin et al., 2018) have been used to convert classification tasks into a textual entailment task (Yin et al., 2019). This approach is naturally suitable for the ZS-MTC task. However, pretrained models are underexplored in the existing work because they do not generate individual vector representations for text or labels, making it unintuitive to combine them with conventional graph encoding methods. In this paper, we explore to improve pretrained models with label hierarchies on the ZS-MTC task. We propose a Reinforced Label Hierarchy Reasoning (RLHR) approach to encourage interdependence among labels in the hierarchies during training. Meanwhile, to overcome the weakness of flat predictions, we design a rollback algorithm that can remove logical errors from predictions during inference. Experimental results on three real-life datasets show that our approach achieves better performance and outperforms previous non-pretrained methods on the ZS-MTC task.

pdf bib
Unsupervised Conversation Disentanglement through Co-Training
Hui Liu | Zhan Shi | Xiaodan Zhu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Conversation disentanglement aims to separate intermingled messages into detached sessions, which is a fundamental task in understanding multi-party conversations. Existing work on conversation disentanglement relies heavily upon human-annotated datasets, which is expensive to obtain in practice. In this work, we explore training a conversation disentanglement model without referencing any human annotations. Our method is built upon the deep co-training algorithm, which consists of two neural networks: a message-pair classifier and a session classifier. The former is responsible of retrieving local relations between two messages while the latter categorizes a message to a session by capturing context-aware information. Both the two networks are initialized respectively with pseudo data built from the unannotated corpus. During the deep co-training process, we use the session classifier as a reinforcement learning component to learn a session assigning policy by maximizing the local rewards given by the message-pair classifier. For the message-pair classifier, we enrich its training data by retrieving message pairs with high confidence from the disentangled sessions predicted by the session classifier. Experimental results on the large Movie Dialogue Dataset demonstrate that our proposed approach achieves competitive performance compared to previous supervised methods. Further experiments show that the predicted disentangled conversations can promote the performance on the downstream task of multi-party response selection.

pdf bib
A Semantic Filter Based on Relations for Knowledge Graph Completion
Zongwei Liang | Junan Yang | Hui Liu | Keju Huang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Knowledge graph embedding, representing entities and relations in the knowledge graphs with high-dimensional vectors, has made significant progress in link prediction. More researchers have explored the representational capabilities of models in recent years. That is, they investigate better representational models to fit symmetry/antisymmetry and combination relationships. The current embedding models are more inclined to utilize the identical vector for the same entity in various triples to measure the matching performance. The observation that measuring the rationality of specific triples means comparing the matching degree of the specific attributes associated with the relations is well-known. Inspired by this fact, this paper designs Semantic Filter Based on Relations(SFBR) to extract the required attributes of the entities. Then the rationality of triples is compared under these extracted attributes through the traditional embedding models. The semantic filter module can be added to most geometric and tensor decomposition models with minimal additional memory. experiments on the benchmark datasets show that the semantic filter based on relations can suppress the impact of other attribute dimensions and improve link prediction performance. The tensor decomposition models with SFBR have achieved state-of-the-art.

2020

pdf bib
Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning
Haochen Liu | Wentao Wang | Yiqi Wang | Hui Liu | Zitao Liu | Jiliang Tang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Dialogue systems play an increasingly important role in various aspects of our daily life. It is evident from recent research that dialogue systems trained on human conversation data are biased. In particular, they can produce responses that reflect people’s gender prejudice. Many debiasing methods have been developed for various NLP tasks, such as word embedding. However, they are not directly applicable to dialogue systems because they are likely to force dialogue models to generate similar responses for different genders. This greatly degrades the diversity of the generated responses and immensely hurts the performance of the dialogue models. In this paper, we propose a novel adversarial learning framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. Extensive experiments on two real-world conversation datasets show that our framework significantly reduces gender bias in dialogue models while maintaining the response quality.

pdf bib
Shallow-to-Deep Training for Neural Machine Translation
Bei Li | Ziyang Wang | Hui Liu | Yufan Jiang | Quan Du | Tong Xiao | Huizhen Wang | Jingbo Zhu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep encoders have been proven to be effective in improving neural machine translation (NMT) systems, but training an extremely deep encoder is time consuming. Moreover, why deep models help NMT is an open question. In this paper, we investigate the behavior of a well-tuned deep Transformer system. We find that stacking layers is helpful in improving the representation ability of NMT models and adjacent layers perform similarly. This inspires us to develop a shallow-to-deep training method that learns deep models by stacking shallow models. In this way, we successfully train a Transformer system with a 54-layer encoder. Experimental results on WMT’16 English-German and WMT’14 English-French translation tasks show that it is 1:4 faster than training from scratch, and achieves a BLEU score of 30:33 and 43:29 on two tasks. The code is publicly available at https://github.com/libeineu/SDT-Training.

pdf bib
Does Gender Matter? Towards Fairness in Dialogue Systems
Haochen Liu | Jamell Dacon | Wenqi Fan | Hui Liu | Zitao Liu | Jiliang Tang
Proceedings of the 28th International Conference on Computational Linguistics

Recently there are increasing concerns about the fairness of Artificial Intelligence (AI) in real-world applications such as computer vision and recommendations. For example, recognition algorithms in computer vision are unfair to black people such as poorly detecting their faces and inappropriately identifying them as “gorillas”. As one crucial application of AI, dialogue systems have been extensively applied in our society. They are usually built with real human conversational data; thus they could inherit some fairness issues which are held in the real world. However, the fairness of dialogue systems has not been well investigated. In this paper, we perform a pioneering study about the fairness issues in dialogue systems. In particular, we construct a benchmark dataset and propose quantitative measures to understand fairness in dialogue models. Our studies demonstrate that popular dialogue models show significant prejudice towards different genders and races. Besides, to mitigate the bias in dialogue systems, we propose two simple but effective debiasing methods. Experiments show that our methods can reduce the bias in dialogue systems significantly. The dataset and the implementation are released to foster fairness research in dialogue systems.

pdf bib
Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation
Bei Li | Hui Liu | Ziyang Wang | Yufan Jiang | Tong Xiao | Jingbo Zhu | Tongran Liu | Changliang Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In encoder-decoder neural models, multiple encoders are in general used to represent the contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder does not only encode the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multi-encoder in context-aware translation - some of the improvements come from robust training. We compare several methods that introduce noise and/or well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. Also, we establish a new state-of-the-art on IWSLT Fr-En task by careful use of noise generation and dropout methods.

pdf bib
Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization
Yue Cao | Hui Liu | Xiaojun Wan
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Cross-lingual summarization is the task of generating a summary in one language given a text in a different language. Previous works on cross-lingual summarization mainly focus on using pipeline methods or training an end-to-end model using the translated parallel data. However, it is a big challenge for the model to directly learn cross-lingual summarization as it requires learning to understand different languages and learning how to summarize at the same time. In this paper, we propose to ease the cross-lingual summarization training by jointly learning to align and summarize. We design relevant loss functions to train this framework and propose several methods to enhance the isomorphism and cross-lingual transfer between languages. Experimental results show that our model can outperform competitive models in most cases. In addition, we show that our model even has the ability to generate cross-lingual summaries without access to any cross-lingual corpus.

pdf bib
The NiuTrans System for the WMT20 Quality Estimation Shared Task
Chi Hu | Hui Liu | Kai Feng | Chen Xu | Nuo Xu | Zefan Zhou | Shiqin Yan | Yingfeng Luo | Chenglong Wang | Xia Meng | Tong Xiao | Jingbo Zhu
Proceedings of the Fifth Conference on Machine Translation

This paper describes the submissions of the NiuTrans Team to the WMT 2020 Quality Estimation Shared Task. We participated in all tasks and all language pairs. We explored the combination of transfer learning, multi-task learning and model ensemble. Results on multiple tasks show that deep transformer machine translation models and multilingual pretraining methods significantly improve translation quality estimation performance. Our system achieved remarkable results in multiple level tasks, e.g., our submissions obtained the best results on all tracks in the sentence-level Direct Assessment task.

2019

pdf bib
INS: An Interactive Chinese News Synthesis System
Hui Liu | Wentao Qin | Xiaojun Wan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

Nowadays, we are surrounded by more and more online news articles. Tens or hundreds of news articles need to be read if we wish to explore a hot news event or topic. So it is of vital importance to automatically synthesize a batch of news articles related to the event or topic into a new synthesis article (or overview article) for reader’s convenience. It is so challenging to make news synthesis fully automatic that there is no successful solution by now. In this paper, we put forward a novel Interactive News Synthesis system (i.e. INS), which can help generate news overview articles automatically or by interacting with users. More importantly, INS can serve as a tool for editors to help them finish their jobs. In our experiments, INS performs well on both topic representation and synthesis article generation. A user study also demonstrates the usefulness and users’ satisfaction with the INS tool. A demo video is available at https://youtu.be/7ItteKW3GEk.

pdf bib
The NiuTrans Machine Translation Systems for WMT19
Bei Li | Yinqiao Li | Chen Xu | Ye Lin | Jiqiang Liu | Hui Liu | Ziyang Wang | Yuhao Zhang | Nuo Xu | Zeyang Wang | Kai Feng | Hexuan Chen | Tengbo Liu | Yanyang Li | Qiang Wang | Tong Xiao | Jingbo Zhu
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper described NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions.

pdf bib
Towards Explainable NLP: A Generative Explanation Framework for Text Classification
Hui Liu | Qingyu Yin | William Yang Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning systems tend to focus on interpreting the outputs or the connections between inputs and outputs. However, the fine-grained information (e.g. textual explanations for the labels) is often ignored, and the systems do not explicitly generate the human-readable explanations. To solve this problem, we propose a novel generative explanation framework that learns to make classification decisions and generate fine-grained explanations at the same time. More specifically, we introduce the explainable factor and the minimum risk training approach that learn to generate more reasonable explanations. We construct two new datasets that contain summaries, rating scores, and fine-grained reasons. We conduct experiments on both datasets, comparing with several strong neural network baseline systems. Experimental results show that our method surpasses all baselines on both datasets, and is able to generate concise explanations at the same time.

2007

pdf bib
Semantic Labeling of Compound Nominalization in Chinese
Jinglei Zhao | Hui Liu | Ruzhan Lu
Proceedings of the Workshop on A Broader Perspective on Multiword Expressions

2006

pdf bib
A Weakly Supervised Learning Approach for Spoken Language Understanding
Wei-Lin Wu | Ru-Zhan Lu | Jian-Yong Duan | Hui Liu | Feng Gao | Yu-Quan Chen
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2004

pdf bib
An Enhanced Model for Chinese Word Segmentation and Part-of-Speech Tagging
Feng Jiang | Hui Liu | Yuquan Chen | Ruzhan Lu
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing