Yunfang Wu


2021

Asking Questions Like Educational Experts: Automatically Generating Question-Answer Pairs on Real-World Examination Data
Fanyi Qu | Xin Jia | Yunfang Wu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Generating high-quality question-answer pairs is a hard but meaningful task. Although previous work has achieved strong results on answer-aware question generation, these methods are difficult to apply in practice in the education field. This paper is the first to address the question-answer pair generation task on real-world examination data, and proposes a new unified framework on RACE. To capture the important information in the input passage, we first automatically generate (rather than extract) keyphrases, which reduces the task to keyphrase-question-answer triplet joint generation. Accordingly, we propose a multi-agent communication model to generate and optimize the question and keyphrases iteratively, and then apply the generated question and keyphrases to guide the generation of answers. To establish a solid benchmark, we build our model on a strong generative pre-training model. Experimental results show that our model makes great breakthroughs in the question-answer pair generation task. Moreover, we present a comprehensive analysis of our model, suggesting new directions for this challenging task.

2020

How to Ask Good Questions? Try to Leverage Paraphrases
Xin Jia | Wenjie Zhou | Xu Sun | Yunfang Wu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Given a sentence and its relevant answer, asking good questions is a challenging task with many real applications. Inspired by humans’ ability to paraphrase, asking questions of the same meaning with diverse expressions, we propose to incorporate paraphrase knowledge into question generation (QG) to generate human-like questions. Specifically, we present a two-hand hybrid model leveraging a self-built paraphrase resource, which is constructed automatically by a simple back-translation method. On the one hand, we conduct multi-task learning with sentence-level paraphrase generation (PG) as an auxiliary task to supply paraphrase knowledge to the task-shared encoder. On the other hand, we adopt a new loss function for diversity training to introduce more question patterns to QG. Extensive experimental results show that our proposed model obtains a clear performance gain over several strong baselines, and further human evaluation validates that our model can ask high-quality questions by leveraging paraphrase knowledge.

Exploiting WordNet Synset and Hypernym Representations for Answer Selection
Weikang Li | Yunfang Wu
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Answer selection (AS) is an important subtask of document-based question answering (DQA). In this task, the candidate answers come from the same document, and each answer sentence is semantically related to the given question, which makes it more challenging to select the true answer. WordNet provides powerful knowledge about concepts and their semantic relations, so we employ WordNet to enrich the paraphrasing and reasoning abilities of the network-based question answering model. Specifically, we exploit the synset and hypernym concepts to enrich the word representation, and we incorporate the similarity scores of two concepts that share synset or hypernym relations into the attention mechanism. The proposed WordNet-enhanced hierarchical model (WEHM) consists of four modules: WordNet-enhanced word representation, sentence encoding, WordNet-enhanced attention mechanism, and hierarchical document encoding. Extensive experiments on the public WikiQA and SelQA datasets demonstrate that our proposed model significantly improves over the baseline system and outperforms all existing state-of-the-art methods by a large margin.

A Question Type Driven and Copy Loss Enhanced Framework for Answer-Agnostic Neural Question Generation
Xiuyu Wu | Nan Jiang | Yunfang Wu
Proceedings of the Fourth Workshop on Neural Generation and Translation

Answer-agnostic question generation is a significant and challenging task, which aims to automatically generate questions for a given sentence without an answer. In this paper, we propose two new strategies to deal with this task: question type prediction and a copy loss mechanism. The question type module predicts the types of questions that should be asked, which allows our model to generate multiple types of questions for the same source sentence. The new copy loss enhances the original copy mechanism to make sure that every important word in the source sentence has been copied when generating questions. Our integrated model outperforms the state-of-the-art approach in answer-agnostic question generation, achieving a BLEU-4 score of 13.9 on SQuAD. Human evaluation further validates the high quality of our generated questions. We will make our code publicly available for further research.

2019

Multi-Task Learning with Language Modeling for Question Generation
Wenjie Zhou | Minghua Zhang | Yunfang Wu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper explores the task of answer-aware question generation. Based on the attention-based pointer generator model, we propose to incorporate an auxiliary task of language modeling to help question generation in a hierarchical multi-task learning structure. Our joint-learning model enables the encoder to learn a better representation of the input sequence, which guides the decoder to generate more coherent and fluent questions. On both the SQuAD and MARCO datasets, our multi-task learning model boosts performance, achieving state-of-the-art results. Moreover, human evaluation further confirms the high quality of our generated questions.

Question-type Driven Question Generation
Wenjie Zhou | Minghua Zhang | Yunfang Wu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Question generation is a challenging task which aims to ask a question based on an answer and relevant context. Existing work suffers from a mismatch between question type and answer, e.g. generating a question of type "how" when the answer is a personal name. We propose to automatically predict the question type based on the input answer and context. The question type is then fused into a seq2seq model to guide question generation, so as to deal with the mismatch problem. We achieve significant improvement in the accuracy of question type prediction and finally obtain state-of-the-art results for question generation on both the SQuAD and MARCO datasets.

Coherent Comments Generation for Chinese Articles with a Graph-to-Sequence Model
Wei Li | Jingjing Xu | Yancheng He | ShengLi Yan | Yunfang Wu | Xu Sun
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Automatic article commenting is helpful in encouraging user engagement on online news platforms. However, news documents are usually too long for models under traditional encoder-decoder frameworks, which often results in general and irrelevant comments. In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph. By organizing the article into a graph structure, our model can better understand the internal structure of the article and the connection between topics, which makes it better able to generate coherent and informative comments. We collect and release a large-scale news-comment corpus from a popular Chinese online news platform, Tencent Kuaibao. Extensive experimental results show that our model can generate much more coherent and informative comments compared with several strong baseline models.

2018

Research on Entity Relation Extraction for Military Field
Chen Liang | Hongying Zan | Yajun Liu | Yunfang Wu
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

Learning Universal Sentence Representations with Mean-Max Attention Autoencoder
Minghua Zhang | Yunfang Wu | Weikang Li | Wei Li
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In order to learn universal sentence representations, previous methods focus on complex recurrent neural networks or supervised learning. In this paper, we propose a mean-max attention autoencoder (mean-max AAE) within the encoder-decoder framework. Our autoencoder relies entirely on the MultiHead self-attention mechanism to reconstruct the input sequence. In the encoding, we propose a mean-max strategy that applies both mean and max pooling operations over the hidden vectors to capture diverse information from the input. To enable the information to steer the reconstruction process dynamically, the decoder performs attention over the mean-max representation. By training our model on a large collection of unlabelled data, we obtain high-quality representations of sentences. Experimental results on a broad range of 10 transfer tasks demonstrate that our model outperforms the state-of-the-art unsupervised single methods, including the classical skip-thoughts and the advanced skip-thoughts+LN model. Furthermore, compared with the traditional recurrent neural network, our mean-max AAE greatly reduces the training time.

2016

ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System
Yunfang Wu | Minghua Zhang
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

Multi-level Gated Recurrent Neural Network for dialog act classification
Wei Li | Yunfang Wu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper we focus on the problem of dialog act (DA) labelling. This problem has recently attracted a lot of attention, as it is an important sub-part of an automatic question answering system, which is currently in great demand. Traditional methods tend to see this problem as a sequence labelling task and deal with it by applying classifiers with rich features. Most current neural network models still omit the sequential information in the conversation. Hence, we apply a novel multi-level gated recurrent neural network (GRNN) with non-textual information to predict the DA tag. Our model utilizes not only textual information but also non-textual and contextual information. In comparison, our model shows a significant improvement of over 6% over previous work on the Switchboard Dialog Act (SWDA) task.

2012

SemEval-2012 Task 4: Evaluating Chinese Word Similarity
Peng Jin | Yunfang Wu
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

Exploiting Discourse Relations for Sentiment Analysis
Fei Wang | Yunfang Wu | Likun Qiu
Proceedings of COLING 2012: Posters

2011

Mining the Sentiment Expectation of Nouns Using Bootstrapping Method
Miaomiao Wen | Yunfang Wu
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Disambiguating Dynamic Sentiment Ambiguous Adjectives
Yunfang Wu | Miaomiao Wen
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

SemEval-2010 Task 18: Disambiguating Sentiment Ambiguous Adjectives
Yunfang Wu | Peng Jin
Proceedings of the 5th International Workshop on Semantic Evaluation

SemEval-2 Task 15: Infrequent Sense Identification for Mandarin Text to Speech Systems
Peng Jin | Yunfang Wu
Proceedings of the 5th International Workshop on Semantic Evaluation

2007

SemEval-2007 Task 05: Multilingual Chinese-English Lexical Sample
Peng Jin | Yunfang Wu | Shiwen Yu
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

PKU: Combining Supervised Classifiers with Features Selection
Peng Jin | Danqing Zhu | Fuxin Li | Yunfang Wu
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

Building Chinese Sense Annotated Corpus with the Help of Software Tools
Yunfang Wu | Peng Jin | Tao Guo | Shiwen Yu
Proceedings of the Linguistic Annotation Workshop

2005

双向考察和驗證:并列成分中心語的語義關係和CCD的名詞語義分類体系 (Bidirectional Investigation: The Semantic Relations between the Conjuncts and the Noun Taxonomy in CCD) [In Chinese]
Yunfang Wu | Sujian Li | Yun Li | Shiwen Yu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 10, Number 4, December 2005: Special Issue on Selected Papers from CLSW-5

隱喻性成語的語義映射 (Semantic Mapping in Chinese Metaphorical Idioms) [In Chinese]
Yun Li | Sujian Li | Zhimin Wang | Yunfang Wu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 10, Number 4, December 2005: Special Issue on Selected Papers from CLSW-5