Yubo Chen


2021

pdf bib
Classification, Extraction, and Normalization : CASIA_Unisound Team at the Social Media Mining for Health 2021 Shared Tasks
Tong Zhou | Zhucong Li | Zhen Gan | Baoli Zhang | Yubo Chen | Kun Niu | Jing Wan | Kang Liu | Jun Zhao | Yafei Shi | Weifeng Chong | Shengping Liu
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

This is the system description of the CASIA_Unisound team for Task 1, Task 7b, and Task 8 of the sixth Social Media Mining for Health Applications (SMM4H) shared task in 2021. Targeting on deal with two shared challenges, the colloquial text and the imbalance annotation, among those tasks, we apply a customized pre-trained language model and propose various training strategies. Experimental results show the effectiveness of our system. Moreover, we got an F1-score of 0.87 in task 8, which is the highest among all participates.

pdf bib
Probing into the Root: A Dataset for Reason Extraction of Structural Events from Financial Documents
Pei Chen | Kang Liu | Yubo Chen | Taifeng Wang | Jun Zhao
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This paper proposes a new task regarding event reason extraction from document-level texts. Unlike the previous causality detection task, we do not assign target events in the text, but only provide structural event descriptions, and such settings accord more with practice scenarios. Moreover, we annotate a large dataset FinReason for evaluation, which provides Reasons annotation for Financial events in company announcements. This task is challenging because the cases of multiple-events, multiple-reasons, and implicit-reasons are included. In total, FinReason contains 8,794 documents, 12,861 financial events and 11,006 reason spans. We also provide the performance of existing canonical methods in event extraction and machine reading comprehension on this task. The results show a 7 percentage point F1 score gap between the best model and human performance, and existing methods are far from resolving this problem.

pdf bib
Knowledge Guided Metric Learning for Few-Shot Text Classification
Dianbo Sui | Yubo Chen | Binjie Mao | Delai Qiu | Kang Liu | Jun Zhao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Humans can distinguish new categories very efficiently with few examples, largely due to the fact that human beings can leverage knowledge obtained from relevant tasks. However, deep learning based text classification model tends to struggle to achieve satisfactory performance when labeled data are scarce. Inspired by human intelligence, we propose to introduce external knowledge into few-shot learning to imitate human knowledge. A novel parameter generator network is investigated to this end, which is able to use the external knowledge to generate different metrics for different tasks. Armed with this network, similar tasks can use similar metrics while different tasks use different metrics. Through experiments, we demonstrate that our method outperforms the SoTA few-shot text classification models.

pdf bib
Jointly Extracting Explicit and Implicit Relational Triples with Reasoning Pattern Enhanced Binary Pointer Network
Yubo Chen | Yunqi Zhang | Changran Hu | Yongfeng Huang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Relational triple extraction is a crucial task for knowledge graph construction. Existing methods mainly focused on explicit relational triples that are directly expressed, but usually suffer from ignoring implicit triples that lack explicit expressions. This will lead to serious incompleteness of the constructed knowledge graphs. Fortunately, other triples in the sentence provide supplementary information for discovering entity pairs that may have implicit relations. Also, the relation types between the implicitly connected entity pairs can be identified with relational reasoning patterns in the real world. In this paper, we propose a unified framework to jointly extract explicit and implicit relational triples. To explore entity pairs that may be implicitly connected by relations, we propose a binary pointer network to extract overlapping relational triples relevant to each word sequentially and retain the information of previously extracted triples in an external memory. To infer the relation types of implicit relational triples, we propose to introduce real-world relational reasoning patterns in our model and capture these patterns with a relation network. We conduct experiments on several benchmark datasets, and the results prove the validity of our method.

pdf bib
A Large-Scale Chinese Multimodal NER Dataset with Speech Clues
Dianbo Sui | Zhengkun Tian | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we aim to explore an uncharted territory, which is Chinese multimodal named entity recognition (NER) with both textual and acoustic contents. To achieve this, we construct a large-scale human-annotated Chinese multimodal NER dataset, named CNERTA. Our corpus totally contains 42,987 annotated sentences accompanying by 71 hours of speech data. Based on this dataset, we propose a family of strong and representative baseline models, which can leverage textual features or multimodal features. Upon these baselines, to capture the natural monotonic alignment between the textual modality and the acoustic modality, we further propose a simple multimodal multitask model by introducing a speech-to-text alignment auxiliary task. Through extensive experiments, we observe that: (1) Progressive performance boosts as we move from unimodal to multimodal, verifying the necessity of integrating speech clues into Chinese NER. (2) Our proposed model yields state-of-the-art (SoTA) results on CNERTA, demonstrating its effectiveness. For further research, the annotated dataset is publicly available at http://github.com/DianboWork/CNERTA.

pdf bib
LearnDA: Learnable Knowledge-Guided Data Augmentation for Event Causality Identification
Xinyu Zuo | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Weihua Peng | Yuguang Chen
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Modern models for event causality identification (ECI) are mainly based on supervised learning, which are prone to the data lacking problem. Unfortunately, the existing NLP-related augmentation methods cannot directly produce available data required for this task. To solve the data lacking problem, we introduce a new approach to augment training data for event causality identification, by iteratively generating new examples and classifying event causality in a dual learning framework. On the one hand, our approach is knowledge guided, which can leverage existing knowledge bases to generate well-formed new sentences. On the other hand, our approach employs a dual mechanism, which is a learnable augmentation framework, and can interactively adjust the generation process to generate task-related sentences. Experimental results on two benchmarks EventStoryLine and Causal-TimeBank show that 1) our method can augment suitable task-related training data for ECI; 2) our method outperforms previous methods on EventStoryLine and Causal-TimeBank (+2.5 and +2.1 points on F1 value respectively).

pdf bib
Knowledge-Enriched Event Causality Identification via Latent Structure Induction Networks
Pengfei Cao | Xinyu Zuo | Yubo Chen | Kang Liu | Jun Zhao | Yuguang Chen | Weihua Peng
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Identifying causal relations of events is an important task in natural language processing area. However, the task is very challenging, because event causality is usually expressed in diverse forms that often lack explicit causal clues. Existing methods cannot handle well the problem, especially in the condition of lacking training data. Nonetheless, humans can make a correct judgement based on their background knowledge, including descriptive knowledge and relational knowledge. Inspired by it, we propose a novel Latent Structure Induction Network (LSIN) to incorporate the external structural knowledge into this task. Specifically, to make use of the descriptive knowledge, we devise a Descriptive Graph Induction module to obtain and encode the graph-structured descriptive knowledge. To leverage the relational knowledge, we propose a Relational Graph Induction module which is able to automatically learn a reasoning structure for event causality reasoning. Experimental results on two widely used datasets indicate that our approach significantly outperforms previous state-of-the-art methods.

pdf bib
Automatic ICD Coding via Interactive Shared Representation Networks with Self-distillation Mechanism
Tong Zhou | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Kun Niu | Weifeng Chong | Shengping Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The ICD coding task aims at assigning codes of the International Classification of Diseases in clinical notes. Since manual coding is very laborious and prone to errors, many methods have been proposed for the automatic ICD coding task. However, existing works either ignore the long-tail of code frequency or the noisy clinical notes. To address the above issues, we propose an Interactive Shared Representation Network with Self-Distillation Mechanism. Specifically, an interactive shared representation network targets building connections among codes while modeling the co-occurrence, consequently alleviating the long-tail problem. Moreover, to cope with the noisy text issue, we encourage the model to focus on the clinical note’s noteworthy part and extract valuable information through a self-distillation learning mechanism. Experimental results on two MIMIC datasets demonstrate the effectiveness of our method.

pdf bib
Document-level Event Extraction via Parallel Prediction Networks
Hang Yang | Dianbo Sui | Yubo Chen | Kang Liu | Jun Zhao | Taifeng Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Document-level event extraction (DEE) is indispensable when events are described throughout a document. We argue that sentence-level extractors are ill-suited to the DEE task where event arguments always scatter across sentences and multiple events may co-exist in a document. It is a challenging task because it requires a holistic understanding of the document and an aggregated ability to assemble arguments across multiple sentences. In this paper, we propose an end-to-end model, which can extract structured events from a document in a parallel manner. Specifically, we first introduce a document-level encoder to obtain the document-aware representations. Then, a multi-granularity non-autoregressive decoder is used to generate events in parallel. Finally, to train the entire model, a matching loss function is proposed, which can bootstrap a global optimization. The empirical results on the widely used DEE dataset show that our approach significantly outperforms current state-of-the-art methods in the challenging DEE task. Code will be available at https://github.com/HangYang-NLP/DE-PPN.

pdf bib
CogIE: An Information Extraction Toolkit for Bridging Texts and CogNet
Zhuoran Jin | Yubo Chen | Dianbo Sui | Chenhao Wang | Zhipeng Xue | Jun Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

CogNet is a knowledge base that integrates three types of knowledge: linguistic knowledge, world knowledge and commonsense knowledge. In this paper, we propose an information extraction toolkit, called CogIE, which is a bridge connecting raw texts and CogNet. CogIE has three features: versatile, knowledge-grounded and extensible. First, CogIE is a versatile toolkit with a rich set of functional modules, including named entity recognition, entity typing, entity linking, relation extraction, event extraction and frame-semantic parsing. Second, as a knowledge-grounded toolkit, CogIE can ground the extracted facts to CogNet and leverage different types of knowledge to enrich extracted results. Third, for extensibility, owing to the design of three-tier architecture, CogIE is not only a plug-and-play toolkit for developers but also an extensible programming framework for researchers. We release an open-access online system to visually extract information from texts. Source code, datasets and pre-trained models are publicly available at GitHub, with a short instruction video.

pdf bib
Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement
Xinyu Zuo | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Weihua Peng | Yuguang Chen
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Named Entity Recognition via Noise Aware Training Mechanism with Data Filter
Xiusheng Huang | Yubo Chen | Shun Wu | Jun Zhao | Yuantao Xie | Weijian Sun
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
KnowDis: Knowledge Enhanced Data Augmentation for Event Causality Detection via Distant Supervision
Xinyu Zuo | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 28th International Conference on Computational Linguistics

Modern models of event causality detection (ECD) are mainly based on supervised learning from small hand-labeled corpora. However, hand-labeled training data is expensive to produce, low coverage of causal expressions, and limited in size, which makes supervised methods hard to detect causal relations between events. To solve this data lacking problem, we investigate a data augmentation framework for ECD, dubbed as Knowledge Enhanced Distant Data Augmentation (KnowDis). Experimental results on two benchmark datasets EventStoryLine corpus and Causal-TimeBank show that 1) KnowDis can augment available training data assisted with the lexical and causal commonsense knowledge for ECD via distant supervision, and 2) our method outperforms previous methods by a large margin assisted with automatically labeled training data.

pdf bib
How Does Context Matter? On the Robustness of Event Detection with Context-Selective Mask Generalization
Jian Liu | Yubo Chen | Kang Liu | Yantao Jia | Zhicheng Sheng
Findings of the Association for Computational Linguistics: EMNLP 2020

Event detection (ED) aims to identify and classify event triggers in texts, which is a crucial subtask of event extraction (EE). Despite many advances in ED, the existing studies are typically centered on improving the overall performance of an ED model, which rarely consider the robustness of an ED model. This paper aims to fill this research gap by stressing the importance of robustness modeling in ED models. We first pinpoint three stark cases demonstrating the brittleness of the existing ED models. After analyzing the underlying reason, we propose a new training mechanism, called context-selective mask generalization for ED, which can effectively mine context-specific patterns for learning and robustify an ED model. The experimental results have confirmed the effectiveness of our model regarding defending against adversarial attacks, exploring unseen predicates, and tackling ambiguity cases. Moreover, a deeper analysis suggests that our approach can learn a complementary predictive bias with most ED models that use full context for feature learning.

pdf bib
HyperCore: Hyperbolic and Co-graph Representation for Automatic ICD Coding
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu | Weifeng Chong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The International Classification of Diseases (ICD) provides a standardized way for classifying diseases, which endows each disease with a unique code. ICD coding aims to assign proper ICD codes to a medical record. Since manual coding is very laborious and prone to errors, many methods have been proposed for the automatic ICD coding task. However, most of existing methods independently predict each code, ignoring two important characteristics: Code Hierarchy and Code Co-occurrence. In this paper, we propose a Hyperbolic and Co-graph Representation method (HyperCore) to address the above problem. Specifically, we propose a hyperbolic representation method to leverage the code hierarchy. Moreover, we propose a graph convolutional network to utilize the code co-occurrence. Experimental results on two widely used datasets demonstrate that our proposed model outperforms previous state-of-the-art methods.

pdf bib
Clinical-Coder: Assigning Interpretable ICD-10 Codes to Chinese Clinical Notes
Pengfei Cao | Chenwei Yan | Xiangling Fu | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu | Weifeng Chong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

In this paper, we introduce Clinical-Coder, an online system aiming to assign ICD codes to Chinese clinical notes. ICD coding has been a research hotspot of clinical medicine, but the interpretability of prediction hinders its practical application. We exploit a Dilated Convolutional Attention network with N-gram Matching mechanism (DCANM) to capture semantic features for non-continuous words and continuous n-gram words, concentrating on explaining the reason why each ICD code to be predicted. The experiments demonstrate that our approach is effective and that our system is able to provide supporting information in clinical decision making.

pdf bib
Incremental Event Detection via Knowledge Consolidation Networks
Pengfei Cao | Yubo Chen | Jun Zhao | Taifeng Wang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Conventional approaches to event detection usually require a fixed set of pre-defined event types. Such a requirement is often challenged in real-world applications, as new events continually occur. Due to huge computation cost and storage budge, it is infeasible to store all previous data and re-train the model with all previous data and new data, every time new events arrive. We formulate such challenging scenarios as incremental event detection, which requires a model to learn new classes incrementally without performance degradation on previous classes. However, existing incremental learning methods cannot handle semantic ambiguity and training data imbalance problems between old and new classes in the task of incremental event detection. In this paper, we propose a Knowledge Consolidation Network (KCN) to address the above issues. Specifically, we devise two components, prototype enhanced retrospection and hierarchical distillation, to mitigate the adverse effects of semantic ambiguity and class imbalance, respectively. Experimental results demonstrate the effectiveness of the proposed method, outperforming the state-of-the-art model by 19% and 13.4% of whole F1 score on ACE benchmark and TAC KBP benchmark, respectively.

pdf bib
Event Extraction as Machine Reading Comprehension
Jian Liu | Yubo Chen | Kang Liu | Wei Bi | Xiaojiang Liu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Event extraction (EE) is a crucial information extraction task that aims to extract event information in texts. Previous methods for EE typically model it as a classification task, which are usually prone to the data scarcity problem. In this paper, we propose a new learning paradigm of EE, by explicitly casting it as a machine reading comprehension problem (MRC). Our approach includes an unsupervised question generation process, which can transfer event schema into a set of natural questions, followed by a BERT-based question-answering process to retrieve answers as EE results. This learning paradigm enables us to strengthen the reasoning process of EE, by introducing sophisticated models in MRC, and relieve the data scarcity problem, by introducing the large-scale datasets in MRC. The empirical results show that: i) our approach attains state-of-the-art performance by considerable margins over previous methods. ii) Our model is excelled in the data-scarce scenario, for example, obtaining 49.8% in F1 for event argument extraction with only 1% data, compared with 2.2% of the previous method. iii) Our model also fits with zero-shot scenarios, achieving 37.0% and 16% in F1 on two datasets without using any EE training data.

pdf bib
FedED: Federated Learning via Ensemble Distillation for Medical Relation Extraction
Dianbo Sui | Yubo Chen | Jun Zhao | Yantao Jia | Yuantao Xie | Weijian Sun
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Unlike other domains, medical texts are inevitably accompanied by private information, so sharing or copying these texts is strictly restricted. However, training a medical relation extraction model requires collecting these privacy-sensitive texts and storing them on one machine, which comes in conflict with privacy protection. In this paper, we propose a privacy-preserving medical relation extraction model based on federated learning, which enables training a central model with no single piece of private local data being shared or exchanged. Though federated learning has distinct advantages in privacy protection, it suffers from the communication bottleneck, which is mainly caused by the need to upload cumbersome local parameters. To overcome this bottleneck, we leverage a strategy based on knowledge distillation. Such a strategy uses the uploaded predictions of ensemble local models to train the central model without requiring uploading local parameters. Experiments on three publicly available medical relation extraction datasets demonstrate the effectiveness of our method.

pdf bib
Named Entity Recognition in Multi-level Contexts
Yubo Chen | Chuhan Wu | Tao Qi | Zhigang Yuan | Yongfeng Huang
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Named entity recognition is a critical task in the natural language processing field. Most existing methods for this task can only exploit contextual information within a sentence. However, their performance on recognizing entities in limited or ambiguous sentence-level contexts is usually unsatisfactory. Fortunately, other sentences in the same document can provide supplementary document-level contexts to help recognize these entities. In addition, words themselves contain word-level contextual information since they usually have different preferences of entity type and relative position from named entities. In this paper, we propose a unified framework to incorporate multi-level contexts for named entity recognition. We use TagLM as our basic model to capture sentence-level contexts. To incorporate document-level contexts, we propose to capture interactions between sentences via a multi-head self attention network. To mine word-level contexts, we propose an auxiliary task to predict the type of each word to capture its type preference. We jointly train our model in entity recognition and the auxiliary classification task via multi-task learning. The experimental results on several benchmark datasets validate the effectiveness of our method.

pdf bib
Reconstructing Event Regions for Event Extraction via Graph Attention Networks
Pei Chen | Hang Yang | Kang Liu | Ruihong Huang | Yubo Chen | Taifeng Wang | Jun Zhao
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Event information is usually scattered across multiple sentences within a document. The local sentence-level event extractors often yield many noisy event role filler extractions in the absence of a broader view of the document-level context. Filtering spurious extractions and aggregating event information in a document remains a challenging problem. Following the observation that a document has several relevant event regions densely populated with event role fillers, we build graphs with candidate role filler extractions enriched by sentential embeddings as nodes, and use graph attention networks to identify event regions in a document and aggregate event information. We characterize edges between candidate extractions in a graph into rich vector representations to facilitate event region identification. The experimental results on two datasets of two languages show that our approach yields new state-of-the-art performance for the challenging event extraction task.

pdf bib
Towards Causal Explanation Detection with Pyramid Salient-Aware Network
Xinyu Zuo | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Causal explanation analysis (CEA) can assist us to understand the reasons behind daily events, which has been found very helpful for understanding the coherence of messages. In this paper, we focus on Causal Explanation Detection, an important subtask of causal explanation analysis, which determines whether a causal explanation exists in one message. We design a Pyramid Salient-Aware Network (PSAN) to detect causal explanations on messages. PSAN can assist in causal explanation detection via capturing the salient semantics of discourses contained in their keywords with a bottom graph-based word-level salient network. Furthermore, PSAN can modify the dominance of discourses via a top attention-based discourse-level salient network to enhance explanatory semantics of messages. The experiments on the commonly used dataset of CEA shows that the PSAN outperforms the state-of-the-art method by 1.8% F1 value on the Causal Explanation Detection task.

pdf bib
Chinese Named Entity Recognition via Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Named entity recognition (NER) aims to identify text spans that mention named entities and classify them into pre-defined categories. For Chinese NER task, most of the existing methods are character-based sequence labeling models and achieve great success. However, these methods usually ignore lexical knowledge, which leads to false prediction of entity boundaries. Moreover, these methods have difficulties in capturing tag dependencies. In this paper, we propose an Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism (AMMNHT) to address all above problems. Specifically, to reduce the errors of predicting entity boundaries, we propose an adaptive multi-pass memory network to exploit lexical knowledge. In addition, we propose a hierarchical tagging layer to learn tag dependencies. Experimental results on three widely used Chinese NER datasets demonstrate that our proposed model significantly outperforms other state-of-the-art methods.

2019

pdf bib
Neural Cross-Lingual Event Detection with Minimal Parallel Resources
Jian Liu | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The scarcity in annotated data poses a great challenge for event detection (ED). Cross-lingual ED aims to tackle this challenge by transferring knowledge between different languages to boost performance. However, previous cross-lingual methods for ED demonstrated a heavy dependency on parallel resources, which might limit their applicability. In this paper, we propose a new method for cross-lingual ED, demonstrating a minimal dependency on parallel resources. Specifically, to construct a lexical mapping between different languages, we devise a context-dependent translation method; to treat the word order difference problem, we propose a shared syntactic order event detector for multilingual co-training. The efficiency of our method is studied through extensive experiments on two standard datasets. Empirical results indicate that our method is effective in 1) performing cross-lingual transfer concerning different directions and 2) tackling the extremely annotation-poor scenario.

pdf bib
Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network
Dianbo Sui | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The lack of word boundaries information has been seen as one of the main obstacles to develop a high performance Chinese named entity recognition (NER) system. Fortunately, the automatically constructed lexicon contains rich word boundaries information and word semantic information. However, integrating lexical knowledge in Chinese NER tasks still faces challenges when it comes to self-matched lexical words as well as the nearest contextual lexical words. We present a Collaborative Graph Network to solve these challenges. Experiments on various datasets show that our model not only outperforms the state-of-the-art (SOTA) results, but also achieves a speed that is six to fifteen times faster than that of the SOTA model.

pdf bib
THU_NGN at SemEval-2019 Task 12: Toponym Detection and Disambiguation on Scientific Papers
Tao Qi | Suyu Ge | Chuhan Wu | Yubo Chen | Yongfeng Huang
Proceedings of the 13th International Workshop on Semantic Evaluation

First name: Tao Last name: Qi Email: taoqi.qt@gmail.com Affiliation: Department of Electronic Engineering, Tsinghua University First name: Suyu Last name: Ge Email: gesy17@mails.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University First name: Chuhan Last name: Wu Email: wuch15@mails.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University First name: Yubo Last name: Chen Email: chen-yb18@mails.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University First name: Yongfeng Last name: Huang Email: yfhuang@mail.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University Toponym resolution is an important and challenging task in the neural language processing field, and has wide applications such as emergency response and social media geographical event analysis. Toponym resolution can be roughly divided into two independent steps, i.e., toponym detection and toponym disambiguation. In order to facilitate the study on toponym resolution, the SemEval 2019 task 12 is proposed, which contains three subtasks, i.e., toponym detection, toponym disambiguation and toponym resolution. In this paper, we introduce our system that participated in the SemEval 2019 task 12. For toponym detection, in our approach we use TagLM as the basic model, and explore the use of various features in this task, such as word embeddings extracted from pre-trained language models, POS tags and lexical features extracted from dictionaries. For toponym disambiguation, we propose a heuristics rule-based method using toponym frequency and population. Our systems achieved 83.03% strict macro F1, 74.50 strict micro F1, 85.92 overlap macro F1 and 78.47 overlap micro F1 in toponym detection subtask.

2018

pdf bib
DCFEE: A Document-level Chinese Financial Event Extraction System based on Automatically Labeled Training Data
Hang Yang | Yubo Chen | Kang Liu | Yang Xiao | Jun Zhao
Proceedings of ACL 2018, System Demonstrations

We present an event extraction framework to detect event mentions and extract events from the document-level financial news. Up to now, methods based on supervised learning paradigm gain the highest performance in public datasets (such as ACE2005, KBP2015). These methods heavily depend on the manually labeled training data. However, in particular areas, such as financial, medical and judicial domains, there is no enough labeled data due to the high cost of data labeling process. Moreover, most of the current methods focus on extracting events from one sentence, but an event is usually expressed by multiple sentences in one document. To solve these problems, we propose a Document-level Chinese Financial Event Extraction (DCFEE) system which can automatically generate a large scaled labeled data and extract events from the whole document. Experimental results demonstrate the effectiveness of it

pdf bib
Neural Metaphor Detecting with CNN-LSTM Model
Chuhan Wu | Fangzhao Wu | Yubo Chen | Sixing Wu | Zhigang Yuan | Yongfeng Huang
Proceedings of the Workshop on Figurative Language Processing

Metaphors are figurative languages widely used in daily life and literatures. It’s an important task to detect the metaphors evoked by texts. Thus, the metaphor shared task is aimed to extract metaphors from plain texts at word level. We propose to use a CNN-LSTM model for this task. Our model combines CNN and LSTM layers to utilize both local and long-range contextual information for identifying metaphorical information. In addition, we compare the performance of the softmax classifier and conditional random field (CRF) for sequential labeling in this task. We also incorporated some additional features such as part of speech (POS) tags and word cluster to improve the performance of model. Our best model achieved 65.06% F-score in the all POS testing subtask and 67.15% in the verbs testing subtask.

pdf bib
Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Named entity recognition (NER) is an important task in natural language processing area, which needs to determine entities boundaries and classify them into pre-defined categories. For Chinese NER task, there is only a very small amount of annotated data available. Chinese NER task and Chinese word segmentation (CWS) task have many similar word boundaries. There are also specificities in each task. However, existing methods for Chinese NER either do not exploit word boundary information from CWS or cannot filter the specific information of CWS. In this paper, we propose a novel adversarial transfer learning framework to make full use of task-shared boundaries information and prevent the task-specific features of CWS. Besides, since arbitrary character can provide important cues when predicting entity type, we exploit self-attention to explicitly capture long range dependencies between two tokens. Experimental results on two different widely used datasets show that our proposed model significantly and consistently outperforms other state-of-the-art methods.

pdf bib
Collective Event Detection via a Hierarchical and Bias Tagging Networks with Gated Multi-level Attention Mechanisms
Yubo Chen | Hang Yang | Kang Liu | Jun Zhao | Yantao Jia
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Traditional approaches to the task of ACE event detection primarily regard multiple events in one sentence as independent ones and recognize them separately by using sentence-level information. However, events in one sentence are usually interdependent and sentence-level information is often insufficient to resolve ambiguities for some types of events. This paper proposes a novel framework dubbed as Hierarchical and Bias Tagging Networks with Gated Multi-level Attention Mechanisms (HBTNGMA) to solve the two problems simultaneously. Firstly, we propose a hierachical and bias tagging networks to detect multiple events in one sentence collectively. Then, we devise a gated multi-level attention to automatically extract and dynamically fuse the sentence-level and document-level information. The experimental results on the widely used ACE 2005 dataset show that our approach significantly outperforms other state-of-the-art methods.

2017

pdf bib
Automatically Labeled Data Generation for Large Scale Event Extraction
Yubo Chen | Shulin Liu | Xiang Zhang | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Modern models of event extraction for tasks like ACE are based on supervised learning of events from small hand-labeled data. However, hand-labeled training data is expensive to produce, in low coverage of event types, and limited in size, which makes supervised methods hard to extract large scale of events for knowledge base population. To solve the data labeling problem, we propose to automatically label training data for event extraction via world knowledge and linguistic knowledge, which can detect key arguments and trigger words for each event type and employ them to label events in texts automatically. The experimental results show that the quality of our large scale automatically labeled data is competitive with elaborately human-labeled data. And our automatically labeled data can incorporate with human-labeled data, then improve the performance of models learned from these data.

pdf bib
Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms
Shulin Liu | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper tackles the task of event detection (ED), which involves identifying and categorizing events. We argue that arguments provide significant clues to this task, but they are either completely ignored or exploited in an indirect manner in existing detection approaches. In this work, we propose to exploit argument information explicitly for ED via supervised attention mechanisms. In specific, we systematically investigate the proposed model under the supervision of different attention strategies. Experimental results show that our approach advances state-of-the-arts and achieves the best F1 score on ACE 2005 dataset.

2016

pdf bib
Leveraging FrameNet to Improve Automatic Event Detection
Shulin Liu | Yubo Chen | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks
Daojian Zeng | Kang Liu | Yubo Chen | Jun Zhao
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks
Yubo Chen | Liheng Xu | Kang Liu | Daojian Zeng | Jun Zhao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

pdf bib
Group Non-negative Matrix Factorization with Natural Categories for Question Retrieval in Community Question Answer Archives
Guangyou Zhou | Yubo Chen | Daojian Zeng | Jun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

pdf bib
Mining Opinion Words and Opinion Targets in a Two-Stage Framework
Liheng Xu | Kang Liu | Siwei Lai | Yubo Chen | Jun Zhao
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)