Qiaoming Zhu

Also published as: Qiao-Ming Zhu, Qiao-ming Zhu, QiaoMing Zhu


2024

Improving Multi-party Dialogue Generation via Topic and Rhetorical Coherence
Yaxin Fan | Peifeng Li | Qiaoming Zhu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Previous studies on multi-party dialogue generation predominantly concentrated on modeling the reply-to structure of dialogue histories, always overlooking the coherence between generated responses and target utterances. To address this issue, we propose a Reinforcement Learning approach emphasizing both Topic and Rhetorical Coherence (RL-TRC). In particular, the topic- and rhetorical-coherence tasks are designed to enhance the model’s perception of coherence with the target utterance. Subsequently, an agent is employed to learn a coherence policy, which guides the generation of responses that are topically and rhetorically aligned with the target utterance. Furthermore, three discourse-aware rewards are developed to assess the coherence between the generated response and the target utterance, with the objective of optimizing the policy. The experimental results and in-depth analyses on two popular datasets demonstrate that our RL-TRC significantly outperforms the state-of-the-art baselines, particularly in generating responses that are more coherent with the target utterances.

Incomplete Utterance Rewriting with Editing Operation Guidance and Utterance Augmentation
Zhiyu Cao | Peifeng Li | Yaxin Fan | Qiaoming Zhu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Although existing popular generation methods for Incomplete Utterance Rewriting (IUR) can generate coherent utterances, they often include irrelevant and redundant tokens in the rewritten utterances because they cannot focus on critical tokens in the dialogue context. Furthermore, the limited size of the training datasets also contributes to the insufficient training of the IUR model. To address the first issue, we propose a multi-task learning framework EO-IUR (Editing Operation-guided Incomplete Utterance Rewriting) that introduces the editing operation labels generated by a sequence labeling module to guide the generation model to focus on critical tokens. Furthermore, we introduce a token-level heterogeneous graph to represent dialogues. To address the second issue, we propose a two-dimensional utterance augmentation strategy, namely editing operation-based incomplete utterance augmentation and LLM-based historical utterance augmentation. The experimental results on three datasets demonstrate that our EO-IUR outperforms previous state-of-the-art (SOTA) baselines in both open-domain and task-oriented dialogue.

Advancing Topic Segmentation and Outline Generation in Chinese Texts: The Paragraph-level Topic Representation, Corpus, and Benchmark
Feng Jiang | Weihao Liu | Xiaomin Chu | Peifeng Li | Qiaoming Zhu | Haizhou Li
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Topic segmentation and outline generation strive to divide a document into coherent topic sections and generate corresponding subheadings, unveiling the discourse topic structure of a document. Compared with sentence-level topic structure, the paragraph-level topic structure helps readers quickly grasp and understand the overall context of the document from a higher level, benefitting many downstream tasks such as summarization, discourse parsing, and information retrieval. However, the lack of large-scale, high-quality Chinese paragraph-level topic structure corpora has restrained related research and applications. To fill this gap, we build the Chinese paragraph-level topic representation, corpus, and benchmark in this paper. Firstly, we propose a hierarchical paragraph-level topic structure representation with three layers to guide the corpus construction. Then, we employ a two-stage man-machine collaborative annotation method to construct the largest Chinese Paragraph-level Topic Structure corpus (CPTS), achieving high quality. We also build several strong baselines, including ChatGPT, to validate the computability of CPTS on two fundamental tasks (topic segmentation and outline generation) and preliminarily verify its usefulness for the downstream task (discourse parsing).

2023

Factual Relation Discrimination for Factuality-oriented Abstractive Summarization
Zhiguang Gao | Peifeng Li | Feng Jiang | Xiaomin Chu | Qiaoming Zhu
Findings of the Association for Computational Linguistics: EMNLP 2023

Most neural abstractive summarization models are capable of producing high-quality summaries. However, these summaries still frequently contain factual errors. Existing factuality-oriented abstractive summarization models only consider the integration of factual information and ignore the causes of factual errors. To address this issue, we propose a factuality-oriented abstractive summarization model, DASum, which is based on a new task, factual relation discrimination, that is able to identify the causes of factual errors. First, we use data augmentation methods to construct counterfactual summaries (i.e., negative samples), and build a factual summarization dataset. Then, we propose the factual relation discrimination task, which determines the factuality of the dependency relations in summaries during summary generation and guides our DASum to generate factual relations, thereby improving the factuality of summaries. Experimental results on the CNN/DM and XSUM datasets show that our DASum outperforms several state-of-the-art benchmarks in terms of the factual metrics.

Cross-Document Event Coreference Resolution on Discourse Structure
Xinyu Chen | Sheng Xu | Peifeng Li | Qiaoming Zhu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Cross-document event coreference resolution (CD-ECR) is the task of clustering event mentions across multiple documents that refer to the same real-world events. Previous studies usually model the CD-ECR task as a pairwise similarity comparison problem by using different event mention features, and consider the highly similar event mention pairs in the same cluster as coreferent. In general, most of them only consider the local context of event mentions and ignore their implicit global information, thus failing to capture the interactions of long-distance event mentions. To address the above issue, we regard discourse structure as global information to further improve CD-ECR. First, we use a discourse rhetorical structure constructor to construct tree structures to represent documents. Then, we obtain shortest dependency paths from the tree structures to represent interactions between event mention pairs. Finally, we feed the above information to a multi-layer perceptron to capture the similarities of event mention pairs for resolving coreferent events. Experimental results on the ECB+ dataset show that our proposed model outperforms several baselines and achieves competitive performance with the state-of-the-art baselines.

Improving Dialogue Discourse Parsing via Reply-to Structures of Addressee Recognition
Yaxin Fan | Feng Jiang | Peifeng Li | Fang Kong | Qiaoming Zhu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Dialogue discourse parsing aims to reflect the relation-based structure of dialogue by establishing discourse links according to discourse relations. To alleviate data sparsity, previous studies have adopted multitasking approaches to jointly learn dialogue discourse parsing with related tasks (e.g., reading comprehension) that require additional human annotation, thus limiting their generality. In this paper, we propose a multitasking framework that integrates dialogue discourse parsing with its neighboring task, addressee recognition. Addressee recognition reveals the reply-to structure that partially overlaps with the relation-based structure, which can be exploited to facilitate relation-based structure learning. To this end, we first propose a reinforcement learning agent to identify training examples from addressee recognition that are most helpful for dialogue discourse parsing. Then, a task-aware structure transformer is designed to capture the shared and private dialogue structure of different tasks, thereby further promoting dialogue discourse parsing. Experimental results on both the Molweni and STAC datasets show that our proposed method can outperform the SOTA baselines. The code will be available at https://github.com/yxfanSuda/RLTST.

CorefPrompt: Prompt-based Event Coreference Resolution by Measuring Event Type and Argument Compatibilities
Sheng Xu | Peifeng Li | Qiaoming Zhu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Event coreference resolution (ECR) aims to group event mentions referring to the same real-world event into clusters. Most previous studies adopt the “encoding first, then scoring” framework, making the coreference judgment rely on event encoding. Furthermore, current methods struggle to leverage human-summarized ECR rules, e.g., coreferential events should have the same event type, to guide the model. To address these two issues, we propose a prompt-based approach, CorefPrompt, to transform ECR into a cloze-style MLM (masked language model) task. This allows for simultaneous event modeling and coreference discrimination within a single template, with a fully shared context. In addition, we introduce two auxiliary prompt tasks, event-type compatibility and argument compatibility, to explicitly demonstrate the reasoning process of ECR, which helps the model make final predictions. Experimental results show that our method CorefPrompt performs well in a state-of-the-art (SOTA) benchmark.

2022

Improving Event Coreference Resolution Using Document-level and Topic-level Information
Sheng Xu | Peifeng Li | Qiaoming Zhu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Event coreference resolution (ECR) aims to cluster event mentions that refer to the same real-world events. Deep learning methods have achieved SOTA results on the ECR task. However, due to the encoding length limitation, previous methods either adopt classical pairwise models based on sentence-level context or split each document into multiple chunks and encode them separately. They failed to capture the interactions and contextual cues among those long-distance event mentions. Besides, high-level information, such as event topics, is rarely considered to enhance representation learning for ECR. To address the above two issues, we first apply a Longformer-based encoder to obtain the document-level embeddings and an encoder with a trigger-mask mechanism to learn sentence-level embeddings based on local context. In addition, we propose an event topic generator to infer the latent topic-level representations. Finally, using the above event embeddings, we employ a multiple tensor matching method to capture their interactions at the document, sentence, and topic levels. Experimental results on the KBP 2017 dataset show that our model outperforms the SOTA baselines.

A Distance-Aware Multi-Task Framework for Conversational Discourse Parsing
Yaxin Fan | Peifeng Li | Fang Kong | Qiaoming Zhu
Proceedings of the 29th International Conference on Computational Linguistics

Conversational discourse parsing aims to construct an implicit utterance dependency tree to reflect the turn-taking in a multi-party conversation. Existing works are generally divided into two lines: graph-based and transition-based paradigms, which perform well for short-distance and long-distance dependency links, respectively. However, no study has considered the advantages of both paradigms to facilitate conversational discourse parsing. We therefore propose a distance-aware multi-task framework, DAMT, that incorporates the strengths of the transition-based paradigm to facilitate the graph-based paradigm in both the encoding and decoding processes. To promote multi-task learning on the two paradigms, we first introduce an Encoding Interactive Module (EIM) to enhance the flow of semantic information between the two paradigms during the encoding step. We then apply a Distance-Aware Graph Convolutional Network (DAGCN) in the decoding process, which can incorporate the different-distance dependency links predicted by the transition-based paradigm to facilitate the decoding of the graph-based paradigm. The experimental results on the datasets STAC and Molweni show that our method can significantly improve the performance of the SOTA graph-based paradigm on long-distance dependency links.

A Hybrid Model of Classification and Generation for Spatial Relation Extraction
Feng Wang | Peifeng Li | Qiaoming Zhu
Proceedings of the 29th International Conference on Computational Linguistics

Extracting spatial relations from texts is a fundamental task for natural language understanding, and previous studies only regard it as a classification task, ignoring those spatial relations with null roles due to their poor information. To address the above issue, we first view spatial relation extraction as a generation task and propose a novel hybrid model HMCGR for this task. HMCGR contains a generation model and a classification model: the former generates null-role relations and the latter extracts non-null-role relations, so that the two complement each other. Moreover, a reflexivity evaluation mechanism is applied to further improve the accuracy based on the reflexivity principle of spatial relation. Experimental results on SpaceEval show that HMCGR outperforms the SOTA baselines significantly.

Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning
Zhong Qian | Heng Zhang | Peifeng Li | Qiaoming Zhu | Guodong Zhou
Proceedings of the 29th International Conference on Computational Linguistics

Document-level Event Factuality Identification (DEFI) predicts the factuality of a specific event based on a document from which the event can be derived, which is a fundamental and crucial task in Natural Language Processing (NLP). However, most previous studies only considered the sentence-level task and did not adopt document-level knowledge. Moreover, they modelled DEFI as a typical text classification task that depends heavily on annotated information and is limited to a task-specific corpus, which resulted in data scarcity. To tackle these issues, we propose a new framework formulating DEFI as Machine Reading Comprehension (MRC) tasks considering both Span-Extraction (Ext) and Multiple-Choice (Mch). Our model does not employ any other explicit annotated information, and utilizes Transfer Learning (TL) to extract knowledge from universal large-scale MRC corpora for cross-domain data augmentation. The empirical results on the DLEFM corpus demonstrate that the proposed model outperforms several state-of-the-art methods.

2021

More than Text: Multi-modal Chinese Word Segmentation
Dong Zhang | Zheng Hu | Shoushan Li | Hanqian Wu | Qiaoming Zhu | Guodong Zhou
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Chinese word segmentation (CWS) is undoubtedly an important basic task in natural language processing. Previous works only focus on the textual modality, but there are often audio and video utterances (such as news broadcast and face-to-face dialogues), where textual, acoustic and visual modalities normally exist. To this end, we attempt to combine the multi-modality (mainly the converted text and actual voice information) to perform CWS. In this paper, we annotate a new dataset for CWS containing text and audio. Moreover, we propose a time-dependent multi-modal interactive model based on Transformer framework to integrate multi-modal information for word sequence labeling. The experimental results on three different training sets show the effectiveness of our approach with fusing text and audio.

Winnowing Knowledge for Multi-choice Question Answering
Yeqiu Li | Bowei Zou | Zhifeng Li | Ai Ti Aw | Yu Hong | Qiaoming Zhu
Findings of the Association for Computational Linguistics: EMNLP 2021

We tackle multi-choice question answering. Acquiring commonsense knowledge related to the question and options facilitates the recognition of the correct answer. However, current reasoning models suffer from noise in the retrieved knowledge. In this paper, we propose a novel encoding method which is able to conduct interception and soft filtering. This contributes to the harvesting and absorption of representative information with less interference from noise. We experiment on CommonsenseQA. Experimental results illustrate that our method yields substantial and consistent improvements compared to the strong BERT, RoBERTa and ALBERT-based baselines.

Not Just Classification: Recognizing Implicit Discourse Relation on Joint Modeling of Classification and Generation
Feng Jiang | Yaxin Fan | Xiaomin Chu | Peifeng Li | Qiaoming Zhu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Implicit discourse relation recognition (IDRR) is a critical task in discourse analysis. Previous studies only regard it as a classification task and lack an in-depth understanding of the semantics of different relations. Therefore, we first view IDRR as a generation task and further propose a method for jointly modeling classification and generation. Specifically, we propose a joint model, CG-T5, to recognize the relation label and generate the target sentence containing the meaning of relations simultaneously. Furthermore, we design three target sentence forms, including the question form, for the generation model to incorporate prior knowledge. To address the issue that large discourse units can hardly be embedded into the target sentence, we also propose a target sentence construction mechanism that automatically extracts core sentences from those large discourse units. Experimental results on both the Chinese MCDTB and English PDTB datasets show that our model CG-T5 achieves the best performance against several state-of-the-art systems.

2020

Chinese Paragraph-level Discourse Parsing with Global Backward and Local Reverse Reading
Feng Jiang | Xiaomin Chu | Peifeng Li | Fang Kong | Qiaoming Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Discourse structure tree construction is the fundamental task of discourse parsing, and most previous work has focused on English. Due to cultural and linguistic differences, existing successful methods for English discourse parsing cannot be transferred to Chinese directly, especially at the paragraph level, which suffers from longer discourse units and fewer explicit connectives. To alleviate the above issues, we propose two reading modes, i.e., the global backward reading and the local reverse reading, to construct Chinese paragraph-level discourse trees. The former processes discourse units from the end to the beginning in a document to utilize the left-branching bias of discourse structure in Chinese, while the latter reverses the position of paragraphs in a discourse unit to enhance the differentiation of coherence between adjacent discourse units. The experimental results on Chinese MCDTB demonstrate that our model outperforms all strong baselines.

融合全局和局部信息的汉语宏观篇章结构识别(Combining Global and Local Information to Recognize Chinese Macro Discourse Structure)
Yaxin Fan (范亚鑫) | Feng Jiang (蒋峰) | Xiaomin Chu (褚晓敏) | Peifeng Li (李培峰) | Qiaoming Zhu (朱巧明)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

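As a fundamental task in macro discourse analysis, discourse structure recognition aims to identify the structure between adjacent discourse units and to build a discourse structure tree hierarchically. Existing work considers either only local structural and semantic information or only global information. This paper therefore proposes a pointer network model that combines global and local information: it considers global semantic information while also taking into account how closely local paragraphs are semantically related, thereby effectively improving macro discourse structure recognition. Experimental results on the Macro Chinese Discourse Treebank (MCDTB) show that the proposed model outperforms the current best-performing model.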

基于阅读理解框架的中文事件论元抽取(Chinese Event Argument Extraction using Reading Comprehension Framework)
Min Chen (陈敏) | Fan Wu (吴凡) | Zhongqing Wang (王中卿) | Peifeng Li (李培峰) | Qiaoming Zhu (朱巧明)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

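Traditional event argument extraction methods treat the task as multi-class classification or sequence labeling over the entity mentions in a sentence. In these methods, argument role categories can only be represented as vectors, and the prior information carried by the argument roles is ignored. In fact, the semantics of an argument role is closely related to the argument itself. This paper therefore proposes treating the task as machine reading comprehension: each argument role is expressed as a question in natural language, and arguments are extracted by answering these questions in context. This approach makes better use of the prior information in argument role categories, and experiments on the ACE2005 Chinese corpus demonstrate its effectiveness.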

基于半监督学习的中文社交文本事件聚类方法(Semi-supervised Method to Cluster Chinese Events on Social Streams)
Hengrui Guo (郭恒睿) | Zhongqing Wang (王中卿) | Peifeng Li (李培峰) | Qiaoming Zhu (朱巧明)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

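Event clustering for social media aims to cluster short texts according to event features. Current event clustering models fall mainly into unsupervised and supervised models. Unsupervised models cluster poorly, while supervised models depend on large amounts of annotated data. To address this, this paper proposes a semi-supervised event clustering model (SemiEC). Starting from a small amount of annotated data, the model uses an LSTM to represent events and a linear model to compute text similarity, performs incremental clustering, retrains the model on the labeled data produced by incremental clustering, and finally re-clusters the uncertain samples. Experiments show that SemiEC outperforms the other models.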

Multi-modal Multi-label Emotion Detection with Modality and Label Dependence
Dong Zhang | Xincheng Ju | Junhui Li | Shoushan Li | Qiaoming Zhu | Guodong Zhou
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

As an important research issue in the natural language processing community, multi-label emotion detection has been drawing more and more attention in the last few years. However, almost all existing studies focus on one modality (e.g., textual modality). In this paper, we focus on multi-label emotion detection in a multi-modal scenario. In this scenario, we need to consider both the dependence among different labels (label dependence) and the dependence between each predicting label and different modalities (modality dependence). Particularly, we propose a multi-modal sequence-to-set approach to effectively model both kinds of dependence in multi-modal multi-label emotion detection. The detailed evaluation demonstrates the effectiveness of our approach.

2019

Topic Tensor Network for Implicit Discourse Relation Recognition in Chinese
Sheng Xu | Peifeng Li | Fang Kong | Qiaoming Zhu | Guodong Zhou
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In the literature, most of the previous studies on English implicit discourse relation recognition only use sentence-level representations, which cannot provide enough semantic information in Chinese due to its unique paratactic characteristics. In this paper, we propose a topic tensor network to recognize Chinese implicit discourse relations with both sentence-level and topic-level representations. In particular, besides encoding arguments (discourse units) using a gated convolutional network to obtain sentence-level representations, we train a simplified topic model to infer the latent topic-level representations. Moreover, we feed the two pairs of representations to two factored tensor networks, respectively, to capture both the sentence-level interactions and topic-level relevance using multi-slice tensors. Experimentation on CDTB, a Chinese discourse corpus, shows that our proposed model significantly outperforms several state-of-the-art baselines in both micro and macro F1-scores.

Document-Level Event Factuality Identification via Adversarial Neural Network
Zhong Qian | Peifeng Li | Qiaoming Zhu | Guodong Zhou
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Document-level event factuality identification is an important subtask in event factuality and is crucial for discourse understanding in Natural Language Processing (NLP). Previous studies mainly suffer from the scarcity of suitable corpus and effective methods. To solve these two issues, we first construct a corpus annotated with both document- and sentence-level event factuality information on both English and Chinese texts. Then we present an LSTM neural network based on adversarial training with both intra- and inter-sequence attentions to identify document-level event factuality. Experimental results show that our neural network model can outperform various baselines on the constructed corpus.

Negative Focus Detection via Contextual Attention Mechanism
Longxiang Shen | Bowei Zou | Yu Hong | Guodong Zhou | Qiaoming Zhu | AiTi Aw
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Negation is a universal but complicated linguistic phenomenon, which has received considerable attention from the NLP community over the last decade, since a negated statement often carries both an explicit negative focus and implicit positive meanings. To understand a negated statement, it is critical to precisely detect the negative focus in context. However, how to capture contextual information for negative focus detection is still an open challenge. To address this, we propose an attention-based neural network to model contextual information. In particular, we introduce a framework which consists of a Bidirectional Long Short-Term Memory (BiLSTM) neural network and a Conditional Random Fields (CRF) layer to effectively encode the order information and the long-range context dependency in a sentence. Moreover, we design two types of attention mechanisms, word-level contextual attention and topic-level contextual attention, to take advantage of contextual information across sentences from both the word perspective and the topic perspective, respectively. Experimental results on the SEM’12 shared task corpus show that our approach achieves the best performance on negative focus detection, yielding an absolute improvement of 2.11% over the state-of-the-art. This demonstrates the great effectiveness of the two types of contextual attention mechanisms.

2018

Self-regulation: Employing a Generative Adversarial Network to Improve Event Detection
Yu Hong | Wenxuan Zhou | Jingli Zhang | Guodong Zhou | Qiaoming Zhu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Due to their ability to encode and map semantic information into a high-dimensional latent feature space, neural networks have been successfully used for detecting events to a certain extent. However, such a feature space can be easily contaminated by spurious features inherent in event detection. In this paper, we propose a self-regulated learning approach by utilizing a generative adversarial network to generate spurious features. On this basis, we employ a recurrent network to eliminate the fakes. Detailed experiments on the ACE 2005 and TAC-KBP 2015 corpora show that our proposed method is highly effective and adaptable.

Employing Text Matching Network to Recognise Nuclearity in Chinese Discourse
Sheng Xu | Peifeng Li | Guodong Zhou | Qiaoming Zhu
Proceedings of the 27th International Conference on Computational Linguistics

The task of nuclearity recognition in Chinese discourse remains challenging due to the demand for more deep semantic information. In this paper, we propose a novel text matching network (TMN) that encodes the discourse units and the paragraphs by combining Bi-LSTM and CNN to capture both global dependency information and local n-gram information. Moreover, it introduces three components of text matching, the Cosine, Bilinear and Single Layer Network, to incorporate various similarities and interactions among the discourse units. Experimental results on the Chinese Discourse TreeBank show that our proposed TMN model significantly outperforms various strong baselines in both micro-F1 and macro-F1.

Joint Modeling of Structure Identification and Nuclearity Recognition in Macro Chinese Discourse Treebank
Xiaomin Chu | Feng Jiang | Yi Zhou | Guodong Zhou | Qiaoming Zhu
Proceedings of the 27th International Conference on Computational Linguistics

Discourse parsing is a challenging task and plays a critical role in discourse analysis. This paper focuses on macro-level discourse structure analysis, which has been less studied in previous research. We explore a macro discourse structure presentation schema to present the macro-level discourse structure, and propose a corresponding corpus, named Macro Chinese Discourse Treebank. On these bases, we concentrate on two tasks of macro discourse structure analysis, including structure identification and nuclearity recognition. In order to reduce the error transmission between the associated tasks, we adopt a joint model of the two tasks, and an Integer Linear Programming approach is proposed to achieve global optimization with various kinds of constraints.

Stance Detection with Hierarchical Attention Network
Qingying Sun | Zhongqing Wang | Qiaoming Zhu | Guodong Zhou
Proceedings of the 27th International Conference on Computational Linguistics

Stance detection aims to assign a stance label (for or against) to a post toward a specific target. Recently, there is a growing interest in using neural models to detect stance of documents. Most of these works model the sequence of words to learn document representation. However, much linguistic information, such as polarity and arguments of the document, is correlated with the stance of the document, and can inspire us to explore the stance. Hence, we present a neural model to fully employ various linguistic information to construct the document representation. In addition, since the influences of different linguistic information are different, we propose a hierarchical attention network to weigh the importance of various linguistic information, and learn the mutual attention between the document and the linguistic information. The experimental results on two datasets demonstrate the effectiveness of the proposed hierarchical attention neural model.

MCDTB: A Macro-level Chinese Discourse TreeBank
Feng Jiang | Sheng Xu | Xiaomin Chu | Peifeng Li | Qiaoming Zhu | Guodong Zhou
Proceedings of the 27th International Conference on Computational Linguistics

In view of the differences between the annotations of micro and macro discourse relationships, this paper describes the relevant experiments on the construction of the Macro Chinese Discourse Treebank (MCDTB), a higher-level Chinese discourse corpus. Following RST (Rhetorical Structure Theory), we annotate the macro discourse information, including discourse structure, nuclearity and relationship, and the additional discourse information, including topic sentences, lead and abstract, to make the macro discourse annotation more objective and accurate. Finally, we annotated 720 articles with a Kappa value greater than 0.6. Preliminary experiments on this corpus verify the computability of MCDTB.

Building a Macro Chinese Discourse Treebank
Xiaomin Chu | Feng Jiang | Sheng Xu | Qiaoming Zhu
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Speculation and Negation Scope Detection via Convolutional Neural Networks
Zhong Qian | Peifeng Li | Qiaoming Zhu | Guodong Zhou | Zhunchen Luo | Wei Luo
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Global Inference to Chinese Temporal Relation Extraction
Peifeng Li | Qiaoming Zhu | Guodong Zhou | Hongling Wang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Previous studies on temporal relation extraction focus on mining sentence-level information or enforcing coherence on different temporal relation types among various event mentions in the same sentence or neighboring sentences, largely ignoring those discourse-level temporal relations in nonadjacent sentences. In this paper, we propose a discourse-level global inference model to mine those temporal relations between event mentions at the document level, especially in nonadjacent sentences. Moreover, we provide various kinds of discourse-level constraints, which are derived from event semantics, to further improve our global inference model. Evaluation on a Chinese corpus justifies the effectiveness of our discourse-level global inference model over two strong baselines.

2015

Unsupervised Negation Focus Identification with Word-Topic Graph Model
Bowei Zou | Guodong Zhou | Qiaoming Zhu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Negation and Speculation Identification in Chinese Language
Bowei Zou | Qiaoming Zhu | Guodong Zhou
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Negation Focus Identification with Contextual Discourse Information
Bowei Zou | Guodong Zhou | Qiaoming Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
Longhua Qian | Haotian Hui | Ya’nan Hu | Guodong Zhou | Qiaoming Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Synchronous Constituent Context Model for Inducing Bilingual Synchronous Structures
Xiangyu Duan | Min Zhang | Qiaoming Zhu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Employing Event Inference to Improve Semi-Supervised Chinese Event Extraction
Peifeng Li | Qiaoming Zhu | Guodong Zhou
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Tree Kernel-based Negation and Speculation Scope Detection with Structured Syntactic Parse Features
Bowei Zou | Guodong Zhou | Qiaoming Zhu
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
Peifeng Li | Qiaoming Zhu | Guodong Zhou
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Joint Modeling of Trigger Identification and Event Type Determination in Chinese Event Extraction
Peifeng Li | Qiaoming Zhu | Hongjun Diao | Guodong Zhou
Proceedings of COLING 2012

Bilingual Lexicon Construction from Comparable Corpora via Dependency Mapping
Longhua Qian | Hongling Wang | Guodong Zhou | Qiaoming Zhu
Proceedings of COLING 2012

A Unified Framework for Discourse Argument Identification via Shallow Semantic Parsing
Fan Xu | Qiaoming Zhu | Guodong Zhou
Proceedings of COLING 2012: Posters

Employing Compositional Semantics and Discourse Consistency in Chinese Event Extraction
Peifeng Li | Guodong Zhou | Qiaoming Zhu | Libin Hou
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Using Cross-Entity Inference to Improve Event Extraction
Yu Hong | Jianfeng Zhang | Bin Ma | Jianmin Yao | Guodong Zhou | Qiaoming Zhu
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Using Context Inference to Improve Sentence Ordering for Multi-document Summarization
Peifeng Li | Guangxi Deng | Qiaoming Zhu
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

A Unified Framework for Scope Learning via Simplified Shallow Semantic Parsing
Qiaoming Zhu | Junhui Li | Hongling Wang | Guodong Zhou
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Jumping Distance based Chinese Person Name Disambiguation
Yu Hong | Fei Pei | Yue-hui Yang | Jian-min Yao | Qiao-ming Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Dependency-driven Anaphoricity Determination for Coreference Resolution
Fang Kong | Guodong Zhou | Longhua Qian | Qiaoming Zhu
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Learning the Scope of Negation via Shallow Semantic Parsing
Junhui Li | Guodong Zhou | Hongling Wang | Qiaoming Zhu
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

A Novel Method for Bilingual Web Page Acquisition from Search Engine Web Records
Yanhui Feng | Yu Hong | Zhenxiang Yan | Jianmin Yao | Qiaoming Zhu
Coling 2010: Posters

Negative Feedback: The Forsaken Nature Available for Re-ranking
Yu Hong | Qing-qing Cai | Song Hua | Jian-min Yao | Qiao-ming Zhu
Coling 2010: Posters

2009

Employing the Centering Theory in Pronoun Resolution from the Semantic Perspective
Fang Kong | GuoDong Zhou | Qiaoming Zhu
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Improving Nominal SRL in Chinese Language with Verbal SRL Information and Automatic Predicate Recognition
Junhui Li | Guodong Zhou | Hai Zhao | Qiaoming Zhu | Peide Qian
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Semi-Supervised Learning for Semantic Relation Classification using Stratified Sampling Strategy
Longhua Qian | Guodong Zhou | Fang Kong | Qiaoming Zhu
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

Dependency Tree-based SRL with Proper Pruning and Extensive Feature Engineering
Hongling Wang | Honglin Wang | Guodong Zhou | Qiaoming Zhu
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

Context-Sensitive Convolution Tree Kernel for Pronoun Resolution
GuoDong Zhou | Fang Kong | QiaoMing Zhu
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Semi-Supervised Learning for Relation Extraction
GuoDong Zhou | JunHui Li | LongHua Qian | QiaoMing Zhu
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Exploiting Constituent Dependencies for Tree Kernel-Based Semantic Relation Extraction
Longhua Qian | Guodong Zhou | Fang Kong | Qiaoming Zhu | Peide Qian
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Tree Kernel-Based Relation Extraction with Context-Sensitive Structured Parse Tree Information
GuoDong Zhou | Min Zhang | Dong Hong Ji | QiaoMing Zhu
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

Performance Analysis and Visualization of Machine Translation Evaluation
Jianmin Yao | Yunqian Qu | Qiang Lv | Qiaoming Zhu | Jing Zhang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 3, September 2006: Special Issue on Selected Papers from ROCLING XVII

A Visualization method for machine translation evaluation results
Jian-Min Yao | Yun-Qian Qu | Qiao-Ming Zhu | Jing Zhang
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation