Houfeng Wang


2024

DVD: Dynamic Contrastive Decoding for Knowledge Amplification in Multi-Document Question Answering
Jing Jin | Houfeng Wang | Hao Zhang | Xiaoguang Li | Zhijiang Guo
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) are widely used in question-answering (QA) systems but often generate information with hallucinations. Retrieval-augmented generation (RAG) offers a potential remedy, yet uneven retrieval quality and irrelevant content may distract LLMs. In this work, we address these issues at the generation phase by treating RAG as a multi-document QA task. We propose a novel decoding strategy, Dynamic Contrastive Decoding (DVD), which dynamically amplifies knowledge from selected documents during the generation phase. DVD involves constructing inputs batchwise, designing new selection criteria to identify documents worth amplifying, and applying contrastive decoding with a specialized weight calculation to adjust the final logits used for sampling answer tokens. Zero-shot experimental results on ALCE-ASQA, NQ, TQA and PopQA benchmarks show that our method outperforms other decoding strategies. Additionally, we conduct experiments to validate the effectiveness of our selection criteria, weight calculation, and general multi-document scenarios. Our method requires no training and can be integrated with other methods to improve RAG performance. Our code will be publicly available at https://github.com/JulieJin-km/Dynamic_Contrastive_Decoding.
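
The logit adjustment mentioned in the abstract above can be illustrated with a generic contrastive-decoding step. This is a minimal sketch, not the paper's implementation: the fixed weight `beta`, the function names, and the toy logits are all assumptions made for illustration, whereas the paper describes a dynamic, specialized weight calculation and document selection criteria that are not reproduced here.

```python
import math


def contrastive_logits(amplified, baseline, beta=0.5):
    """Generic contrastive adjustment: amplified + beta * (amplified - baseline).

    `amplified` are next-token logits computed with a selected document in the
    prompt; `baseline` are logits computed without it. `beta` is a hypothetical
    fixed weight, not the paper's weight calculation.
    """
    if len(amplified) != len(baseline):
        raise ValueError("logit vectors must have the same length")
    return [a + beta * (a - b) for a, b in zip(amplified, baseline)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 3-token vocabulary (made-up numbers).
with_doc = [2.0, 1.0, 0.0]
without_doc = [1.0, 1.0, 1.0]

adjusted = contrastive_logits(with_doc, without_doc, beta=0.5)
probs = softmax(adjusted)
```

In this toy case the token that the document supports (index 0) gains probability mass relative to decoding from `with_doc` alone, which is the amplification effect that contrastive decoding aims for.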

SPOR: A Comprehensive and Practical Evaluation Method for Compositional Generalization in Data-to-Text Generation
Ziyao Xu | Houfeng Wang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Compositional generalization is an important ability of language models and has many different manifestations. For data-to-text generation, previous research on this ability is limited to a single manifestation called Systematicity and lacks consideration of large language models (LLMs), which cannot fully cover practical application scenarios. In this work, we propose SPOR, a comprehensive and practical evaluation method for compositional generalization in data-to-text generation. SPOR includes four aspects of manifestations (Systematicity, Productivity, Order invariance, and Rule learnability) and allows high-quality evaluation without additional manual annotations based on existing datasets. We demonstrate SPOR on two different datasets and evaluate some existing language models including LLMs. We find that the models are deficient in various aspects of the evaluation and need further improvement. Our work shows the necessity for comprehensive research on different manifestations of compositional generalization in data-to-text generation and provides a framework for evaluation.

Detection-Correction Structure via General Language Model for Grammatical Error Correction
Wei Li | Houfeng Wang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Grammatical error correction (GEC) is a task dedicated to rectifying texts with minimal edits, which can be decoupled into two components: detection and correction. However, previous works have predominantly focused on direct correction, with no prior efforts to integrate both into a single model. Moreover, the exploration of the detection-correction paradigm by large language models (LLMs) remains underdeveloped. This paper introduces an integrated detection-correction structure, named DeCoGLM, based on the General Language Model (GLM). The detection phase employs a fault-tolerant detection template, while the correction phase leverages autoregressive mask infilling for localized error correction. Through the strategic organization of input tokens and modification of attention masks, we facilitate multi-task learning within a single model. Our model demonstrates competitive performance against the state-of-the-art models on English and Chinese GEC datasets. Further experiments demonstrate the effectiveness of the detection-correction structure in LLMs, suggesting a promising direction for GEC.

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Feifan Song | Bowen Yu | Hao Lang | Haiyang Yu | Fei Huang | Houfeng Wang | Yongbin Li
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Alignment with human preference prevents large language models (LLMs) from generating misleading or toxic content while requiring high-cost human feedback. Assuming resources of human annotation are limited, there are two ways to allocate them: more diverse PROMPTS or more diverse RESPONSES to be labeled. Nonetheless, a straightforward comparison between their impact is absent. In this work, we first control the diversity of both sides according to the number of samples for fine-tuning, which can directly reflect their influence. We find that instead of numerous prompts, more responses but fewer prompts better trigger LLMs for human alignment. Additionally, the concept of diversity for prompts can be more complex than that for responses, which is typically quantified by single digits. Consequently, a new formulation of prompt diversity is proposed, further implying a linear correlation with the final performance of LLMs after fine-tuning. We also leverage it on data augmentation and conduct experiments to show its effect on different algorithms.

Select High-quality Synthetic QA Pairs to Augment Training Data in MRC under the Reward Guidance of Generative Language Models
Jing Jin | Houfeng Wang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Synthesizing QA pairs via a question generator (QG) for data augmentation is widely used in Machine Reading Comprehension (MRC), especially in data-scarce scenarios like limited labeled data or domain adaptation. However, the quality of generated QA pairs varies, and it is necessary to select the high-quality ones among them. Existing approaches focus on downstream metrics to choose QA pairs, which lacks generalization across different metrics and datasets. In this paper, we propose a general selection method that employs a generative large pre-trained language model as a reward model in a Reinforcement Learning (RL) framework for the training of the selection agent. Our experiments on both generative and extractive datasets demonstrate that our selection method leads to better downstream performance. We also find that using the large language model (LLM) as a reward model is more beneficial than using it as a direct selector or QA model. Furthermore, we assess the selected QA pairs from multiple angles, not just downstream metrics, highlighting their superior quality compared to other methods. Our work has better flexibility across metrics, provides interpretability for the selected data, and expands the potential of leveraging generative large language models in the field of MRC and RL training. Our code is available at https://github.com/JulieJin-km/LLM_RL_Selection.

Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification
Zihan Wang | Peiyi Wang | Houfeng Wang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Hierarchical text classification (HTC) is a challenging subtask of multi-label classification due to its complex taxonomic structure. Nearly all recent HTC works focus on how the labels are structured but ignore the sub-structure of ground-truth labels corresponding to each input text, which contains rich label co-occurrence information. In this work, we introduce this local hierarchy with an adversarial framework. We propose a HiAdv framework that can fit in nearly all HTC models and optimize them with the local hierarchy as auxiliary information. We test on two typical HTC models and find that HiAdv is effective in all scenarios and is adept at dealing with complex taxonomic hierarchies. Further experiments demonstrate that the improvement from our framework indeed comes from the local hierarchy, and that the local hierarchy is beneficial for rare classes which have insufficient training data.

Would You Like to Make a Donation? A Dialogue System to Persuade You to Donate
Yuhan Song | Houfeng Wang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Persuasive dialogue is a type of dialogue commonly used in daily life in scenarios such as promotion and sales. Its purpose is to influence the decision, attitude or behavior of another person through the dialogue process. Persuasive automated dialogue systems can be applied in a variety of fields such as charity, business, education, and healthcare. Despite their impressive abilities, Large Language Models (LLMs) such as ChatGPT still have limitations in persuasion. Little current research on automated dialogue systems is dedicated to persuasive dialogue. In this paper, we introduce a persuasive automated dialogue system. In the system, a context-aware persuasion strategy selection module lets the dialogue system flexibly use different persuasion strategies to persuade users; then a natural language generation module is used to output a response. We also propose a persuasiveness prediction model to automatically evaluate the persuasiveness of generated text. Experimental results show that our dialogue system can achieve better performance on several automated evaluation metrics than baseline models.

2022

Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification
Zihan Wang | Peiyi Wang | Lianzhe Huang | Xin Sun | Houfeng Wang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Hierarchical text classification is a challenging subtask of multi-label classification due to its complex label hierarchy. Existing methods encode text and label hierarchy separately and mix their representations for classification, where the hierarchy remains unchanged for all input text. Instead of modeling them separately, in this work, we propose Hierarchy-guided Contrastive Learning (HGCLR) to directly embed the hierarchy into a text encoder. During training, HGCLR constructs positive samples for input text under the guidance of the label hierarchy. By pulling together the input text and its positive sample, the text encoder can learn to generate the hierarchy-aware text representation independently. Therefore, after training, the HGCLR enhanced text encoder can dispense with the redundant hierarchy. Extensive experiments on three benchmark datasets verify the effectiveness of HGCLR.

Adjusting the Precision-Recall Trade-Off with Align-and-Predict Decoding for Grammatical Error Correction
Xin Sun | Houfeng Wang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Modern writing assistance applications are always equipped with a Grammatical Error Correction (GEC) model to correct errors in user-entered sentences. Different scenarios have varying requirements for correction behavior, e.g., performing more precise corrections (high precision) or providing more candidates for users (high recall). However, previous works adjust this trade-off only for sequence labeling approaches. In this paper, we propose a simple yet effective counterpart, Align-and-Predict Decoding (APD), for the most popular sequence-to-sequence models to offer more flexibility for the precision-recall trade-off. During inference, APD aligns the already generated sequence with the input and adjusts the scores of the following tokens. Experiments on both English and Chinese GEC benchmarks show that our approach not only adapts a single model to precision-oriented and recall-oriented inference, but also maximizes its potential to achieve state-of-the-art results. Our code is available at https://github.com/AutoTemp/Align-and-Predict.

M3: A Multi-View Fusion and Multi-Decoding Network for Multi-Document Reading Comprehension
Liang Wen | Houfeng Wang | Yingwei Luo | Xiaolin Wang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

The multi-document reading comprehension task requires collecting evidence from different documents to answer questions. Previous research either uses the extractive modeling method to naively integrate the scores from different documents on the encoder side or uses the generative modeling method to collect the clues from different documents on the decoder side individually. However, no single modeling method can make full use of the advantages of both. In this work, we propose a novel method that employs a multi-view fusion and multi-decoding mechanism to achieve this. For one thing, our approach leverages a question-centered fusion mechanism and a cross-attention mechanism to gather fine-grained fusion of evidence clues from different documents in the encoder and decoder concurrently. For another, our method simultaneously employs both the extractive decoding approach and the generative decoding method to effectively guide the training process. Compared with existing methods, our method can perform both extractive decoding and generative decoding independently and optionally. Our experiments on two mainstream multi-document reading comprehension datasets (Natural Questions and TriviaQA) demonstrate that our method can provide consistent improvements over previous state-of-the-art methods.

HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification
Zihan Wang | Peiyi Wang | Tianyu Liu | Binghuai Lin | Yunbo Cao | Zhifang Sui | Houfeng Wang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Hierarchical text classification (HTC) is a challenging subtask of multi-label classification due to its complex label hierarchy. Recently, pretrained language models (PLMs) have been widely adopted in HTC through a fine-tuning paradigm. However, in this paradigm, there exists a huge gap between the classification tasks with sophisticated label hierarchy and the masked language model (MLM) pretraining tasks of PLMs, and thus the potential of PLMs cannot be fully tapped. To bridge the gap, in this paper, we propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label MLM perspective. Specifically, we construct a dynamic virtual template and label words that take the form of soft prompts to fuse the label hierarchy knowledge and introduce a zero-bounded multi-label cross-entropy loss to harmonize the objectives of HTC and MLM. Extensive experiments show HPT achieves state-of-the-art performance on 3 popular HTC datasets and is adept at handling imbalanced and low-resource situations. Our code is available at https://github.com/wzh9969/HPT.

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt
Lianzhe Huang | Shuming Ma | Dongdong Zhang | Furu Wei | Houfeng Wang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Prompt-based tuning has been proven effective for pretrained language models (PLMs). While most of the existing work focuses on monolingual prompts, we study multilingual prompts for multilingual PLMs, especially in the zero-shot cross-lingual setting. To alleviate the effort of designing different prompts for multiple languages, we propose a novel model that uses a unified prompt for all languages, called UniPrompt. Different from discrete prompts and soft prompts, the unified prompt is model-based and language-agnostic. Specifically, the unified prompt is initialized by a multilingual PLM to produce a language-independent representation, which is then fused with the text input. During inference, the prompts can be pre-computed so that no extra computation cost is needed. To collocate with the unified prompt, we propose a new initialization method for the target label word to further improve the model’s transferability across languages. Extensive experiments show that our proposed methods can significantly outperform the strong baselines across different languages. We release data and code to facilitate future research.

Learning Invariant Representation Improves Robustness for MRC Models
Yu Hai | Liang Wen | Haoran Meng | Tianyu Liu | Houfeng Wang
Findings of the Association for Computational Linguistics: EMNLP 2022

The prosperity of Pretrained Language Models (PLMs) has greatly promoted the development of Machine Reading Comprehension (MRC). However, these models are vulnerable and not robust to adversarial examples. In this paper, we propose Stable and Contrastive Question Answering (SCQA) to improve the invariance of representations and alleviate these robustness issues. Specifically, we first construct positive example pairs which have the same answer through data augmentation. Then SCQA learns enhanced representations with better alignment between positive pairs by introducing stability and contrastive loss. Experimental results show that our approach can boost the robustness of QA models across different MRC tasks and attack sets significantly and consistently.

Original Content Is All You Need! an Empirical Study on Leveraging Answer Summary for WikiHowQA Answer Selection Task
Liang Wen | Juan Li | Houfeng Wang | Yingwei Luo | Xiaolin Wang | Xiaodong Zhang | Zhicong Cheng | Dawei Yin
Proceedings of the 29th International Conference on Computational Linguistics

The answer selection task requires finding appropriate answers to questions from informative but crowdsourced candidates. A key factor impeding its solution by current answer selection approaches is the redundancy and lengthiness of crowdsourced answers. Recently, Deng et al. (2020) constructed a new dataset, WikiHowQA, which contains a corresponding reference summary for each original lengthy answer. Their experiments show that leveraging the answer summaries helps to attend to the essential information in original lengthy answers and improves the answer selection performance under certain circumstances. However, when given a question and a set of long candidate answers, human beings could effortlessly identify the correct answer without the aid of additional answer summaries, since the original answers contain all the information that the answer summaries contain. In addition, pretrained language models have been shown to be superior or comparable to human beings on many natural language processing tasks. Motivated by these observations, we design a series of neural models, either pretraining-based or non-pretraining-based, to check whether the additional answer summaries are helpful for ranking the relevance of question-answer pairs on the WikiHowQA dataset. Extensive automated experiments and hand analysis show that the additional answer summaries are not useful for achieving the best performance.

Multi-Layer Pseudo-Siamese Biaffine Model for Dependency Parsing
Ziyao Xu | Houfeng Wang | Bingdong Wang
Proceedings of the 29th International Conference on Computational Linguistics

The biaffine method is a strong and efficient method for graph-based dependency parsing. However, previous work only used the biaffine method at the end of the dependency parser as a scorer, and its application in multi-layer form has been ignored. In this paper, we propose a multi-layer pseudo-Siamese biaffine model for neural dependency parsing. In this model, we modify the biaffine method so that it can be utilized in multi-layer form, and use a pseudo-Siamese biaffine module to construct the arc weight matrix for final prediction. In our proposed multi-layer architecture, the biaffine method serves as both scorer and attention mechanism at the same time in each layer. We evaluate our model on PTB, CTB, and UD. The model achieves state-of-the-art results on these datasets. Further experiments show the benefits of introducing the multi-layer form and the pseudo-Siamese module into the biaffine method with low efficiency loss.

2021

Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding
Xin Sun | Tao Ge | Furu Wei | Houfeng Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we propose Shallow Aggressive Decoding (SAD) to improve the online inference efficiency of the Transformer for instantaneous Grammatical Error Correction (GEC). SAD optimizes the online inference efficiency for GEC by two innovations: 1) it aggressively decodes as many tokens as possible in parallel instead of always decoding only one token in each step to improve computational parallelism; 2) it uses a shallow decoder instead of the conventional Transformer architecture with balanced encoder-decoder depth to reduce the computational cost during inference. Experiments on both English and Chinese GEC benchmarks show that aggressive decoding could yield identical predictions to greedy decoding but with significant speedup for online inference. Its combination with the shallow decoder could offer an even higher online inference speedup over the powerful Transformer baseline without quality loss. Not only does our approach allow a single model to achieve state-of-the-art results on English GEC benchmarks (66.4 F0.5 on CoNLL-14 and 72.9 F0.5 on the BEA-19 test set) with an almost 10x online inference speedup over the Transformer-big model, but it is also easily adapted to other languages. Our code is available at https://github.com/AutoTemp/Shallow-Aggressive-Decoding.
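
For readers unfamiliar with the F0.5 figures quoted above: the score follows the standard F-beta definition with beta = 0.5, which weights precision more heavily than recall. The worked numbers below (P = 0.8, R = 0.4) are hypothetical and are not taken from the paper.

```latex
F_{\beta} = \frac{(1+\beta^{2})\,P\,R}{\beta^{2}P + R},
\qquad
F_{0.5} = \frac{1.25\,P\,R}{0.25\,P + R}

% Hypothetical example with P = 0.8, R = 0.4:
F_{0.5} = \frac{1.25 \times 0.8 \times 0.4}{0.25 \times 0.8 + 0.4}
        = \frac{0.4}{0.6} \approx 0.667
```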

Do It Once: An Embarrassingly Simple Joint Matching Approach to Response Selection
Linhao Zhang | Dehong Ma | Sujian Li | Houfeng Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Syntax-Aware Graph Attention Network for Aspect-Level Sentiment Classification
Lianzhe Huang | Xin Sun | Sujian Li | Linhao Zhang | Houfeng Wang
Proceedings of the 28th International Conference on Computational Linguistics

Aspect-level sentiment classification aims to distinguish the sentiment polarities over aspect terms in a sentence. Existing approaches mostly focus on modeling the relationship between the given aspect words and their contexts with attention, and ignore the use of more elaborate knowledge implicit in the context. In this paper, we bring syntactic awareness to the model through a graph attention network over the dependency tree structure, and external pre-training knowledge through the BERT language model, which helps to better model the interaction between the context and aspect words. The subwords of BERT are also integrated into the dependency tree graphs, which yields more accurate word representations through graph attention. Experiments demonstrate the effectiveness of our model.

2019

Exploring Sequence-to-Sequence Learning in Aspect Term Extraction
Dehong Ma | Sujian Li | Fangzhao Wu | Xing Xie | Houfeng Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Aspect term extraction (ATE) aims at identifying all aspect terms in a sentence and is usually modeled as a sequence labeling problem. However, sequence labeling based methods cannot make full use of the overall meaning of the whole sentence and are limited in processing dependencies between labels. To tackle these problems, we first explore formalizing ATE as a sequence-to-sequence (Seq2Seq) learning task where the source sequence and target sequence are composed of words and labels respectively. At the same time, to make Seq2Seq learning suit ATE, where labels correspond to words one by one, we design gated unit networks to incorporate the corresponding word representation into the decoder, and position-aware attention to pay more attention to the adjacent words of a target word. The experimental results on two datasets show that Seq2Seq learning is effective in ATE when accompanied by our proposed gated unit networks and position-aware attention mechanism.

Text Level Graph Neural Network for Text Classification
Lianzhe Huang | Dehong Ma | Sujian Li | Xiaodong Zhang | Houfeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recently, researchers have explored graph neural network (GNN) techniques for text classification, since GNNs do well in handling complex structures and preserving global information. However, previous GNN-based methods mainly face two practical problems: a fixed corpus-level graph structure, which does not support online testing, and high memory consumption. To tackle these problems, we propose a new GNN-based model that builds a graph for each input text with global parameter sharing instead of a single graph for the whole corpus. This method removes the dependence between an individual text and the entire corpus, which supports online testing while still preserving global information. Besides, we build graphs with much smaller windows in the text, which not only extracts more local features but also significantly reduces the number of edges as well as memory consumption. Experiments show that our model outperforms existing models on several text classification datasets while consuming less memory.

2018

Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach
Jingjing Xu | Xu Sun | Qi Zeng | Xiaodong Zhang | Xuancheng Ren | Houfeng Wang | Wenjie Li
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The goal of sentiment-to-sentiment “translation” is to change the underlying sentiment of a sentence while keeping its content. The main challenge is the lack of parallel data. To solve this problem, we propose a cycled reinforcement learning method that enables training on unpaired data by collaboration between a neutralization module and an emotionalization module. We evaluate our approach on two review datasets, Yelp and Amazon. Experimental results show that our approach significantly outperforms the state-of-the-art systems. In particular, the proposed method substantially improves the content preservation performance. The BLEU score is improved from 1.64 to 22.46 and from 0.56 to 14.06 on the two datasets, respectively.

Question Condensing Networks for Answer Selection in Community Question Answering
Wei Wu | Xu Sun | Houfeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Answer selection is an important subtask of community question answering (CQA). In a real-world CQA forum, a question is often represented as two parts: a subject that summarizes the main points of the question, and a body that elaborates on the subject in detail. Previous research on answer selection usually ignored the difference between these two parts and concatenated them as the question representation. In this paper, we propose the Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions. In our model, the question subject is the primary part of the question representation, and the question body information is aggregated based on similarity and disparity with the question subject. Experimental results show that QCN outperforms all existing models on two CQA datasets.

Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
Shuming Ma | Xu Sun | Junyang Lin | Houfeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Most of the current abstractive text summarization models are based on the sequence-to-sequence model (Seq2Seq). The source content of social media is long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic representation. Compared with the source content, the annotated summary is short and well written. Moreover, it shares the same meaning as the source content. In this work, we supervise the learning of the representation of the source content with that of the summary. In implementation, we regard a summary autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.

A Neural Question Answering Model Based on Semi-Structured Tables
Hao Wang | Xiaodong Zhang | Shuming Ma | Xu Sun | Houfeng Wang | Mengxiang Wang
Proceedings of the 27th International Conference on Computational Linguistics

Most question answering (QA) systems are based on raw text and structured knowledge graphs. However, raw text corpora are hard for a QA system to understand, and structured knowledge graphs need intensive manual work, while it is relatively easy to obtain semi-structured tables from many sources directly, or to build them automatically. In this paper, we build an end-to-end system to answer multiple choice questions with semi-structured tables as its knowledge. Our system answers queries in two steps. First, it finds the most similar tables. Then the system measures the relevance between each question and candidate table cells, and chooses the most related cell as the source of the answer. The system is evaluated on the TabMCQ dataset and achieves a large improvement over the state of the art.

SGM: Sequence Generation Model for Multi-label Classification
Pengcheng Yang | Xu Sun | Wei Li | Shuming Ma | Wei Wu | Houfeng Wang
Proceedings of the 27th International Conference on Computational Linguistics

Multi-label classification is an important yet challenging task in natural language processing. It is more complex than single-label classification in that the labels tend to be correlated. Existing methods tend to ignore the correlations between labels. Besides, different parts of the text can contribute differently to predicting different labels, which is not considered by existing models. In this paper, we propose to view the multi-label classification task as a sequence generation problem, and apply a sequence generation model with a novel decoder structure to solve it. Extensive experimental results show that our proposed methods outperform previous work by a substantial margin. Further analysis of experimental results demonstrates that the proposed methods not only capture the correlations between labels, but also select the most informative words automatically when predicting different labels.

simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Fenglin Liu | Xuancheng Ren | Yuanxin Liu | Houfeng Wang | Xu Sun
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The encoder-decoder framework has shown recent success in image captioning. Visual attention, which is good at detailedness, and semantic attention, which is good at comprehensiveness, have been separately proposed to ground the caption on the image. In this paper, we propose the Stepwise Image-Topic Merging Network (simNet) that makes use of the two kinds of attention at the same time. At each time step when generating the caption, the decoder adaptively merges the attentive information in the extracted topics and the image according to the generated context, so that the visual information and the semantic information can be effectively combined. The proposed approach is evaluated on two benchmark datasets and reaches state-of-the-art performance.

pdf bib
Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning
Chen Shi | Qi Chen | Lei Sha | Sujian Li | Xu Sun | Houfeng Wang | Lintao Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The lack of labeled data is one of the main challenges when building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive, limited in size, and low in coverage. In this paper, we instead propose a framework, Auto-Dialabel, to automatically cluster the dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can reduce human labeling cost to a great extent, achieve good intent clustering accuracy (84.1%), and provide reasonable and instructive slot labeling results.

pdf bib
Phrase-level Self-Attention Networks for Universal Sentence Encoding
Wei Wu | Houfeng Wang | Tianyu Liu | Shuming Ma
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Universal sentence encoding is a hot topic in recent NLP research. The attention mechanism has been an integral part of many sentence encoding models, allowing the models to capture context dependencies regardless of the distance between the elements in the sequence. Fully attention-based models have recently attracted enormous interest due to their highly parallelizable computation and significantly shorter training time. However, the memory consumption of these models grows quadratically with the sentence length, and the syntactic information is neglected. To this end, we propose Phrase-level Self-Attention Networks (PSAN) that perform self-attention across words inside a phrase to capture context dependencies at the phrase level, and use a gated memory updating mechanism to refine each word’s representation hierarchically with longer-term context dependencies captured in a larger phrase. As a result, the memory consumption can be reduced because the self-attention is performed at the phrase level instead of the sentence level. At the same time, syntactic information can be easily integrated into the model. Experimental results show that PSAN can achieve state-of-the-art performance across a plethora of NLP tasks including binary and multi-class classification, natural language inference and sentence similarity.

pdf bib
Joint Learning for Targeted Sentiment Analysis
Dehong Ma | Sujian Li | Houfeng Wang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Targeted sentiment analysis (TSA) aims at extracting targets and classifying their sentiment classes. Previous works only exploit word embeddings as features and do not explore the further potential of neural networks when jointly learning the two tasks. In this paper, we carefully design the hierarchical stack bidirectional gated recurrent units (HSBi-GRU) model to learn abstract features for both tasks, and we propose an HSBi-GRU-based joint model which allows the target label to influence the sentiment label. Experimental results on two datasets show that our joint learning model can outperform other baselines and demonstrate the effectiveness of HSBi-GRU in learning abstract features.

2017

pdf bib
Addressing Domain Adaptation for Chinese Word Segmentation with Global Recurrent Structure
Shen Huang | Xu Sun | Houfeng Wang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Boundary features are widely used in traditional Chinese Word Segmentation (CWS) methods as they can utilize unlabeled data to help improve the Out-of-Vocabulary (OOV) word recognition performance. Although various neural network methods for CWS have achieved performance competitive with state-of-the-art systems, these methods, constrained by the domain and size of the training corpus, do not work well in domain adaptation. In this paper, we propose a novel BLSTM-based neural network model which incorporates a global recurrent structure designed for modeling boundary features dynamically. Experiments show that the proposed structure can effectively boost the performance of Chinese Word Segmentation, especially OOV-Recall, which brings benefits to domain adaptation. We achieved state-of-the-art results on 6 domains of CNKI articles, and results competitive with the best reported on the 4 domains of SIGHAN Bakeoff 2010 data.

pdf bib
Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification
Yizhong Wang | Sujian Li | Jingfeng Yang | Xu Sun | Houfeng Wang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Identifying implicit discourse relations between text spans is a challenging task because it requires understanding the meaning of the text. To tackle this task, recent studies have tried several deep learning methods, but few of them have exploited syntactic information. In this work, we explore the idea of incorporating syntactic parse trees into neural networks. Specifically, we employ the Tree-LSTM and Tree-GRU models, which are based on the tree structure, to encode the arguments in a relation. We further leverage the constituent tags to control the semantic composition process in these tree-structured neural networks. Experimental results show that our method achieves state-of-the-art performance on the PDTB corpus.

pdf bib
Cascading Multiway Attentions for Document-level Sentiment Classification
Dehong Ma | Sujian Li | Xiaodong Zhang | Houfeng Wang | Xu Sun
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Document-level sentiment classification aims to assign a sentiment polarity to user reviews. Previous methods either just utilized the document content without consideration of user and product information, or did not comprehensively consider what roles the three kinds of information play in text modeling. In this paper, to reasonably use all the information, we present the idea that user, product and their combination can all influence the generation of attentions to words and sentences when judging the sentiment of a document. With this idea, we propose a cascading multiway attention (CMA) model, where multiple ways of using user and product information are cascaded to influence the generation of attentions on the word and sentence layers. Then, sentences and documents are well modeled by multiple representation vectors, which provide rich information for sentiment classification. Experiments on IMDB and Yelp datasets demonstrate the effectiveness of our model.

pdf bib
Learning to Rank Semantic Coherence for Topic Segmentation
Liang Wang | Sujian Li | Yajuan Lv | Houfeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Topic segmentation plays an important role in discourse parsing and information retrieval. Due to the absence of training data, previous work mainly adopts unsupervised methods to rank semantic coherence between paragraphs for topic segmentation. In this paper, we present an intuitive and simple idea to automatically create a “quasi” training dataset, which includes a large number of text pairs from the same or different documents with different semantic coherence. With the training corpus, we design a symmetric CNN to model text pairs and rank the semantic coherence within the learning to rank framework. Experiments show that our algorithm is able to achieve competitive performance over strong baselines on several real-world datasets.

pdf bib
Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective
Qing Zhang | Houfeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

For the task of relation extraction, distant supervision is an efficient approach to generate labeled data by aligning a knowledge base with free text. In essence, it is a challenging incomplete multi-label classification problem with sparse and noisy features. To address the challenge, this work presents a novel nonparametric Bayesian formulation for the task. Experimental results show substantial improvements in top precision over traditional state-of-the-art approaches.

pdf bib
A Two-Stage Parsing Method for Text-Level Discourse Analysis
Yizhong Wang | Sujian Li | Houfeng Wang
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Previous work introduced transition-based algorithms to form a unified architecture for parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance. In this paper, we propose that a transition-based model is more appropriate for parsing the naked discourse tree (i.e., identifying span and nuclearity) due to data sparsity. At the same time, we argue that relation labeling can benefit from the naked tree structure and should be treated elaborately with consideration of three kinds of relations: within-sentence, across-sentence and across-paragraph relations. Thus, we design a pipelined two-stage parsing method for generating an RST tree from text. Experimental results show that our method achieves state-of-the-art performance, especially on span and nuclearity identification.

pdf bib
Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization
Shuming Ma | Xu Sun | Jingjing Xu | Houfeng Wang | Wenjie Li | Qi Su
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Current Chinese social media text summarization models are based on an encoder-decoder framework. Although the generated summaries are literally similar to the source texts, they have low semantic relevance. In this work, our goal is to improve semantic relevance between source texts and summaries for Chinese social media summarization. We introduce a Semantic Relevance Based neural model to encourage high semantic similarity between texts and summaries. In our model, the source text is represented by a gated attention encoder, while the summary representation is produced by a decoder. In addition, the similarity score between the representations is maximized during training. Our experiments show that the proposed model outperforms baseline systems on a social media corpus.

2016

pdf bib
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)
Deyi Xiong | Kevin Duh | Eneko Agirre | Nora Aranberri | Houfeng Wang
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)

pdf bib
Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis
Shen Huang | Houfeng Wang
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

Grammatical Error Diagnosis for Chinese has always been a challenge for both foreign learners and NLP researchers, owing to the variety of grammar and the flexibility of expression. In this paper, we present a model based on Bidirectional Long Short-Term Memory (Bi-LSTM) neural networks, which treats the task as a sequence labeling problem, so as to detect Chinese grammatical errors, identify the error types, and locate the error positions. In the corpora of this year’s shared task, there can be multiple errors at a single offset of a sentence; to address this, we simultaneously train three Bi-LSTM models sharing word embeddings, which label Missing, Redundant and Selection errors respectively. We regard a word ordering error as a special kind of word selection error that is longer during the training phase, and then separate them by length during the testing phase. In the NLP-TEA 3 shared task for Chinese Grammatical Error Diagnosis (CGED), our system achieved relatively high F1 for all three levels in the Traditional Chinese track and for the detection level in the Simplified Chinese track.

pdf bib
Bidirectional Recurrent Convolutional Neural Network for Relation Classification
Rui Cai | Xiaodong Zhang | Houfeng Wang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Knowledge-Based Semantic Embedding for Machine Translation
Chen Shi | Shujie Liu | Shuo Ren | Shi Feng | Mu Li | Ming Zhou | Xu Sun | Houfeng Wang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Multi-label Text Categorization with Joint Learning Predictions-as-Features Method
Li Li | Houfeng Wang | Xu Sun | Baobao Chang | Shi Zhao | Lei Sha
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Dependency-Based Neural Network for Relation Classification
Yang Liu | Furu Wei | Sujian Li | Heng Ji | Ming Zhou | Houfeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf bib
Learning Summary Prior Representation for Extractive Summarization
Ziqiang Cao | Furu Wei | Sujian Li | Wenjie Li | Ming Zhou | Houfeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
A Unified Framework for Grammar Error Correction
Longkai Zhang | Houfeng Wang
Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task

pdf bib
Go Climb a Dependency Tree and Correct the Grammatical Errors
Longkai Zhang | Houfeng Wang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Predicting Chinese Abbreviations with Minimum Semantic Unit and Global Constraints
Longkai Zhang | Li Li | Houfeng Wang | Xu Sun
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Multi-label Text Categorization with Hidden Components
Li Li | Longkai Zhang | Houfeng Wang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Coarse-grained Candidate Generation and Fine-grained Re-ranking for Chinese Abbreviation Prediction
Longkai Zhang | Houfeng Wang | Xu Sun
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Collaborative Topic Regression with Multiple Graphs Factorization for Recommendation in Social Media
Qing Zhang | Houfeng Wang
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Multi-view Chinese Treebanking
Likun Qiu | Yue Zhang | Peng Jin | Houfeng Wang
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Feature-Frequency–Adaptive On-line Training for Fast and Accurate Natural Language Processing
Xu Sun | Wenjie Li | Houfeng Wang | Qin Lu
Computational Linguistics, Volume 40, Issue 3 - September 2014

2013

pdf bib
Exploring Representations from Unlabeled Data with Co-training for Chinese Word Segmentation
Longkai Zhang | Houfeng Wang | Xu Sun | Mairgup Mansur
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Efficient Collective Entity Linking with Stacking
Zhengyan He | Shujie Liu | Yang Song | Mu Li | Ming Zhou | Houfeng Wang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Learning Entity Representation for Entity Disambiguation
Zhengyan He | Shujie Liu | Mu Li | Ming Zhou | Longkai Zhang | Houfeng Wang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Improving Chinese Word Segmentation on Micro-blog Using Rich Punctuations
Longkai Zhang | Li Li | Zhengyan He | Houfeng Wang | Ni Sun
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Generalized Abbreviation Prediction with Negative Full Forms and Its Application on Improving Chinese Web Search
Xu Sun | Wenjie Li | Fanqi Meng | Houfeng Wang
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

pdf bib
Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection
Xu Sun | Houfeng Wang | Wenjie Li
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Cross-Lingual Mixture Model for Sentiment Classification
Xinfan Meng | Furu Wei | Xiaohua Liu | Ming Zhou | Ge Xu | Houfeng Wang
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
The Task 2 of CIPS-SIGHAN 2012 Named Entity Recognition and Disambiguation in Chinese Bakeoff
Zhengyan He | Houfeng Wang | Sujian Li
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
A Comparison and Improvement of Online Learning Algorithms for Sequence Labeling
Zhengyan He | Houfeng Wang
Proceedings of COLING 2012

pdf bib
Constructing Chinese Abbreviation Dictionary: A Stacked Approach
Longkai Zhang | Sujian Li | Houfeng Wang | Ni Sun | Xinfan Meng
Proceedings of COLING 2012

pdf bib
Lost in Translations? Building Sentiment Lexicons using Context Based Machine Translation
Xinfan Meng | Furu Wei | Ge Xu | Longkai Zhang | Xiaohua Liu | Ming Zhou | Houfeng Wang
Proceedings of COLING 2012: Posters

pdf bib
Joint Learning for Coreference Resolution with Markov Logic
Yang Song | Jing Jiang | Wayne Xin Zhao | Sujian Li | Houfeng Wang
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

pdf bib
Link Type Based Pre-Cluster Pair Model for Coreference Resolution
Yang Song | Houfeng Wang | Jing Jiang
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

2010

pdf bib
A Pipeline Approach to Chinese Personal Name Disambiguation
Yang Song | Zhengyan He | Chen Chen | Houfeng Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
Applying Spectral Clustering for Chinese Word Sense Induction
Zhengyan He | Yang Song | Houfeng Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
Build Chinese Emotion Lexicons Using A Graph-based Algorithm and Multiple Resources
Ge Xu | Xinfan Meng | Houfeng Wang
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

pdf bib
Mining User Reviews: from Specification to Summarization
Xinfan Meng | Houfeng Wang
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf bib
Clustering Technique in Multi-Document Personal Name Disambiguation
Chen Chen | Junfeng Hu | Houfeng Wang
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop

2008

pdf bib
Bootstrapping Both Product Features and Opinion Words from Chinese Customer Reviews with Cross-Inducing
Bo Wang | Houfeng Wang
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
Chinese Named Entity Recognition and Word Segmentation Based on Character
Jingzhou He | Houfeng Wang
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

2003

pdf bib
News-Oriented Keyword Indexing with Maximum Entropy Principle
Sujian Li | Houfeng Wang | Shiwen Yu | Chengsheng Xin
Proceedings of the 17th Pacific Asia Conference on Language, Information and Computation

pdf bib
News-Oriented Automatic Chinese Keyword Indexing
Sujian Li | Houfeng Wang | Shiwen Yu | Chengsheng Xin
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing