Le Sun


2021

pdf bib
From Discourse to Narrative: Knowledge Projection for Event Relation Extraction
Jialong Tang | Hongyu Lin | Meng Liao | Yaojie Lu | Xianpei Han | Le Sun | Weijian Xie | Jin Xu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Current event-centric knowledge graphs highly rely on explicit connectives to mine relations between events. Unfortunately, due to the sparsity of connectives, these methods severely undermine the coverage of EventKGs. The lack of high-quality labelled corpora further exacerbates that problem. In this paper, we propose a knowledge projection paradigm for event relation extraction: projecting discourse knowledge to narratives by exploiting the commonalities between them. Specifically, we propose Multi-tier Knowledge Projection Network (MKPNet), which can leverage multi-tier discourse knowledge effectively for event relation extraction. In this way, the labelled data requirement is significantly reduced, and implicit event relations can be effectively extracted. Intrinsic experimental results show that MKPNet achieves the new state-of-the-art performance and extrinsic experimental results verify the value of the extracted event relations.

pdf bib
Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
Boxi Cao | Hongyu Lin | Xianpei Han | Le Sun | Lingyong Yan | Meng Liao | Tong Xue | Jin Xu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Previous literatures show that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. In this paper, we conduct a rigorous study to explore the underlying predicting mechanisms of MLMs over different extraction paradigms. By investigating the behaviors of MLMs, we find that previous decent performance mainly owes to the biased prompts which overfit dataset artifacts. Furthermore, incorporating illustrative cases and external contexts improve knowledge prediction mainly due to entity type guidance and golden answer leakage. Our findings shed light on the underlying predicting mechanisms of MLMs, and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases.

pdf bib
Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction
Yaojie Lu | Hongyu Lin | Jin Xu | Xianpei Han | Jialong Tang | Annan Li | Le Sun | Meng Liao | Shaoyi Chen
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Event extraction is challenging due to the complex structure of event records and the semantic gap between text and event. Traditional methods usually extract event records by decomposing the complex structure prediction task into multiple subtasks. In this paper, we propose Text2Event, a sequence-to-structure generation paradigm that can directly extract events from the text in an end-to-end manner. Specifically, we design a sequence-to-structure network for unified event extraction, a constrained decoding algorithm for event knowledge injection during inference, and a curriculum learning algorithm for efficient model learning. Experimental results show that, by uniformly modeling all tasks in a single model and universally predicting different labels, our method can achieve competitive performance using only record-level annotations in both supervised learning and transfer learning settings.

pdf bib
Element Intervention for Open Relation Extraction
Fangchao Liu | Lingyong Yan | Hongyu Lin | Xianpei Han | Le Sun
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Open relation extraction aims to cluster relation instances referring to the same underlying relation, which is a critical step for general relation extraction. Current OpenRE models are commonly trained on the datasets generated from distant supervision, which often results in instability and makes the model easily collapsed. In this paper, we revisit the procedure of OpenRE from a causal view. By formulating OpenRE using a structural causal model, we identify that the above-mentioned problems stem from the spurious correlations from entities and context to the relation type. To address this issue, we conduct Element Intervention, which intervene on the context and entities respectively to obtain the underlying causal effects of them. We also provide two specific implementations of the interventions based on entity ranking and context contrasting. Experimental results on unsupervised relation extraction datasets show our method to outperform previous state-of-the-art methods and is robust across different datasets.

pdf bib
De-biasing Distantly Supervised Named Entity Recognition via Causal Intervention
Wenkai Zhang | Hongyu Lin | Xianpei Han | Le Sun
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Distant supervision tackles the data bottleneck in NER by automatically generating training instances via dictionary matching. Unfortunately, the learning of DS-NER is severely dictionary-biased, which suffers from spurious correlations and therefore undermines the effectiveness and the robustness of the learned models. In this paper, we fundamentally explain the dictionary bias via a Structural Causal Model (SCM), categorize the bias into intra-dictionary and inter-dictionary biases, and identify their causes. Based on the SCM, we learn de-biased DS-NER via causal interventions. For intra-dictionary bias, we conduct backdoor adjustment to remove the spurious correlations introduced by the dictionary confounder. For inter-dictionary bias, we propose a causal invariance regularizer which will make DS-NER models more robust to the perturbation of dictionaries. Experiments on four datasets and three DS-NER models show that our method can significantly improve the performance of DS-NER.

pdf bib
From Paraphrasing to Semantic Parsing: Unsupervised Semantic Parsing via Synchronous Semantic Decoding
Shan Wu | Bo Chen | Chunlei Xin | Xianpei Han | Le Sun | Weipeng Zhang | Jiansong Chen | Fan Yang | Xunliang Cai
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Semantic parsing is challenging due to the structure gap and the semantic gap between utterances and logical forms. In this paper, we propose an unsupervised semantic parsing method - Synchronous Semantic Decoding (SSD), which can simultaneously resolve the semantic gap and the structure gap by jointly leveraging paraphrasing and grammar-constrained decoding. Specifically, we reformulate semantic parsing as a constrained paraphrasing problem: given an utterance, our model synchronously generates its canonical utterancel and meaning representation. During synchronously decoding: the utterance paraphrasing is constrained by the structure of the logical form, therefore the canonical utterance can be paraphrased controlledly; the semantic decoding is guided by the semantics of the canonical utterance, therefore its logical form can be generated unsupervisedly. Experimental results show that SSD is a promising approach and can achieve state-of-the-art unsupervised semantic parsing performance on multiple datasets.

2020

pdf bib
Syntactic and Semantic-driven Learning for Open Information Extraction
Jialong Tang | Yaojie Lu | Hongyu Lin | Xianpei Han | Le Sun | Xinyan Xiao | Hua Wu
Findings of the Association for Computational Linguistics: EMNLP 2020

One of the biggest bottlenecks in building accurate, high coverage neural open IE systems is the need for large labelled corpora. The diversity of open domain corpora and the variety of natural language expressions further exacerbate this problem. In this paper, we propose a syntactic and semantic-driven learning approach, which can learn neural open IE models without any human-labelled data by leveraging syntactic and semantic knowledge as noisier, higher-level supervision. Specifically, we first employ syntactic patterns as data labelling functions and pretrain a base model using the generated labels. Then we propose a syntactic and semantic-driven reinforcement learning algorithm, which can effectively generalize the base model to open situations with high accuracy. Experimental results show that our approach significantly outperforms the supervised counterparts, and can even achieve competitive performance to supervised state-of-the-art (SoA) model.

pdf bib
Global Bootstrapping Neural Network for Entity Set Expansion
Lingyong Yan | Xianpei Han | Ben He | Le Sun
Findings of the Association for Computational Linguistics: EMNLP 2020

Bootstrapping for entity set expansion (ESE) has been studied for a long period, which expands new entities using only a few seed entities as supervision. Recent end-to-end bootstrapping approaches have shown their advantages in information capturing and bootstrapping process modeling. However, due to the sparse supervision problem, previous end-to-end methods often only leverage information from near neighborhoods (local semantics) rather than those propagated from the co-occurrence structure of the whole corpus (global semantics). To address this issue, this paper proposes Global Bootstrapping Network (GBN) with the “pre-training and fine-tuning” strategies for effective learning. Specifically, it contains a global-sighted encoder to capture and encode both local and global semantics into entity embedding, and an attention-guided decoder to sequentially expand new entities based on these embeddings. The experimental results show that the GBN learned by “pre-training and fine-tuning” strategies achieves state-of-the-art performance on two bootstrapping datasets.

pdf bib
BERT-QE: Contextualized Query Expansion for Document Re-ranking
Zhi Zheng | Kai Hui | Ben He | Xianpei Han | Le Sun | Andrew Yates
Findings of the Association for Computational Linguistics: EMNLP 2020

Query expansion aims to mitigate the mismatch between the language used in a query and in a document. However, query expansion methods can suffer from introducing non-relevant information when expanding the query. To bridge this gap, inspired by recent advances in applying contextualized models like BERT to the document retrieval task, this paper proposes a novel query expansion model that leverages the strength of the BERT model to select relevant document chunks for expansion. In evaluation on the standard TREC Robust04 and GOV2 test collections, the proposed BERT-QE model significantly outperforms BERT-Large models.

pdf bib
A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?
Hongyu Lin | Yaojie Lu | Jialong Tang | Xianpei Han | Le Sun | Zhicheng Wei | Nicholas Jing Yuan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Fine-tuning pretrained model has achieved promising performance on standard NER benchmarks. Generally, these benchmarks are blessed with strong name regularity, high mention coverage and sufficient context diversity. Unfortunately, when scaling NER to open situations, these advantages may no longer exist. And therefore it raises a critical question of whether previous creditable approaches can still work well when facing these challenges. As there is no currently available dataset to investigate this problem, this paper proposes to conduct randomization test on standard benchmarks. Specifically, we erase name regularity, mention coverage and context diversity respectively from the benchmarks, in order to explore their impact on the generalization ability of models. To further verify our conclusions, we also construct a new open NER dataset that focuses on entity types with weaker name regularity and lower mention coverage to verify our conclusion. From both randomization test and empirical experiments, we draw the conclusions that 1) name regularity is critical for the models to generalize to unseen mentions; 2) high mention coverage may undermine the model generalization ability and 3) context patterns may not require enormous data to capture when using pretrained encoders.

pdf bib
ISCAS at SemEval-2020 Task 5: Pre-trained Transformers for Counterfactual Statement Modeling
Yaojie Lu | Annan Li | Hongyu Lin | Xianpei Han | Le Sun
Proceedings of the Fourteenth Workshop on Semantic Evaluation

ISCAS participated in two subtasks of SemEval 2020 Task 5: detecting counterfactual statements and detecting antecedent and consequence. This paper describes our system which is based on pretrained transformers. For the first subtask, we train several transformer-based classifiers for detecting counterfactual statements. For the second subtask, we formulate antecedent and consequence extraction as a query-based question answering problem. The two subsystems both achieved third place in the evaluation. Our system is openly released at https://github.com/casnlu/ISCASSemEval2020Task5.

2019

pdf bib
Learning to Bootstrap for Entity Set Expansion
Lingyong Yan | Xianpei Han | Le Sun | Ben He
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category. Traditional bootstrapping methods often suffer from two problems: 1) delayed feedback, i.e., the pattern evaluation relies on both its direct extraction quality and extraction quality in later iterations. 2) sparse supervision, i.e., only few seed entities are used as the supervision. To address the above two problems, we propose a novel bootstrapping method combining the Monte Carlo Tree Search (MCTS) algorithm with a deep similarity network, which can efficiently estimate delayed feedback for pattern evaluation and adaptively score entities given sparse supervision signals. Experimental results confirm the effectiveness of the proposed method.

pdf bib
Gazetteer-Enhanced Attentive Neural Networks for Named Entity Recognition
Hongyu Lin | Yaojie Lu | Xianpei Han | Le Sun | Bin Dong | Shanshan Jiang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Current region-based NER models only rely on fully-annotated training data to learn effective region encoder, which often face the training data bottleneck. To alleviate this problem, this paper proposes Gazetteer-Enhanced Attentive Neural Networks, which can enhance region-based NER by learning name knowledge of entity mentions from easily-obtainable gazetteers, rather than only from fully-annotated data. Specially, we first propose an attentive neural network (ANN), which explicitly models the mention-context association and therefore is convenient for integrating externally-learned knowledge. Then we design an auxiliary gazetteer network, which can effectively encode name regularity of mentions only using gazetteers. Finally, the learned gazetteer network is incorporated into ANN for better NER. Experiments show that our ANN can achieve the state-of-the-art performance on ACE2005 named entity recognition benchmark. Besides, incorporating gazetteer network can further improve the performance and significantly reduce the requirement of training data.

pdf bib
EUSP: An Easy-to-Use Semantic Parsing PlatForm
Bo An | Chen Bo | Xianpei Han | Le Sun
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

Semantic parsing aims to map natural language utterances into structured meaning representations. We present a modular platform, EUSP (Easy-to-Use Semantic Parsing PlatForm), that facilitates developers to build semantic parser from scratch. Instead of requiring a large amount of training data or complex grammar knowledge, in our platform developers can build grammar-based semantic parser or neural-based semantic parser through configure files which specify the modules and components that compose semantic parsing system. A high quality grammar-based semantic parsing system only requires domain lexicons rather than costly training data for a semantic parser. Furthermore, we provide a browser-based method to generate the semantic parsing system to minimize the difficulty of development. Experimental results show that the neural-based semantic parser system achieves competitive performance on semantic parsing task, and grammar-based semantic parsers significantly improve the performance of a business search engine.

pdf bib
Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis
Jialong Tang | Ziyao Lu | Jinsong Su | Yubin Ge | Linfeng Song | Le Sun | Jiebo Luo
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In aspect-level sentiment classification (ASC), it is prevalent to equip dominant neural models with attention mechanisms, for the sake of acquiring the importance of each context word on the given aspect. However, such a mechanism tends to excessively focus on a few frequent words with sentiment polarities, while ignoring infrequent ones. In this paper, we propose a progressive self-supervised attention learning approach for neural ASC models, which automatically mines useful attention supervision information from a training corpus to refine attention mechanisms. Specifically, we iteratively conduct sentiment predictions on all training instances. Particularly, at each iteration, the context word with the maximum attention weight is extracted as the one with active/misleading influence on the correct/incorrect prediction of every instance, and then the word itself is masked for subsequent iterations. Finally, we augment the conventional training objective with a regularization term, which enables ASC models to continue equally focusing on the extracted active context words while decreasing weights of those misleading ones. Experimental results on multiple datasets show that our proposed approach yields better attention mechanisms, leading to substantial improvements over the two state-of-the-art neural ASC models. Source code and trained models are available at https://github.com/DeepLearnXMU/PSSAttention.

pdf bib
Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning
Yaojie Lu | Hongyu Lin | Xianpei Han | Le Sun
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Event detection systems rely on discrimination knowledge to distinguish ambiguous trigger words and generalization knowledge to detect unseen/sparse trigger words. Current neural event detection approaches focus on trigger-centric representations, which work well on distilling discrimination knowledge, but poorly on learning generalization knowledge. To address this problem, this paper proposes a Delta-learning approach to distill discrimination and generalization knowledge by effectively decoupling, incrementally learning and adaptively fusing event representation. Experiments show that our method significantly outperforms previous approaches on unseen/sparse trigger words, and achieves state-of-the-art performance on both ACE2005 and KBP2017 datasets.

pdf bib
Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks
Hongyu Lin | Yaojie Lu | Xianpei Han | Le Sun
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Sequential labeling-based NER approaches restrict each word belonging to at most one entity mention, which will face a serious problem when recognizing nested entity mentions. In this paper, we propose to resolve this problem by modeling and leveraging the head-driven phrase structures of entity mentions, i.e., although a mention can nest other mentions, they will not share the same head word. Specifically, we propose Anchor-Region Networks (ARNs), a sequence-to-nuggets architecture for nested mention detection. ARNs first identify anchor words (i.e., possible head words) of all mentions, and then recognize the mention boundaries for each anchor word by exploiting regular phrase structures. Furthermore, we also design Bag Loss, an objective function which can train ARNs in an end-to-end manner without using any anchor word annotation. Experiments show that ARNs achieve the state-of-the-art performance on three standard nested entity mention detection benchmarks.

pdf bib
Cost-sensitive Regularization for Label Confusion-aware Event Detection
Hongyu Lin | Yaojie Lu | Xianpei Han | Le Sun
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In supervised event detection, most of the mislabeling occurs between a small number of confusing type pairs, including trigger-NIL pairs and sibling sub-types of the same coarse type. To address this label confusion problem, this paper proposes cost-sensitive regularization, which can force the training procedure to concentrate more on optimizing confusing type pairs. Specifically, we introduce a cost-weighted term into the training loss, which penalizes more on mislabeling between confusing label pairs. Furthermore, we also propose two estimators which can effectively measure such label confusion based on instance-level or population-level statistics. Experiments on TAC-KBP 2017 datasets demonstrate that the proposed method can significantly improve the performances of different models in both English and Chinese event detection.

2018

pdf bib
Semi-Supervised Lexicon Learning for Wide-Coverage Semantic Parsing
Bo Chen | Bo An | Le Sun | Xianpei Han
Proceedings of the 27th International Conference on Computational Linguistics

Semantic parsers critically rely on accurate and high-coverage lexicons. However, traditional semantic parsers usually utilize annotated logical forms to learn the lexicon, which often suffer from the lexicon coverage problem. In this paper, we propose a graph-based semi-supervised learning framework that makes use of large text corpora and lexical resources. This framework first constructs a graph with a phrase similarity model learned by utilizing many text corpora and lexical resources. Next, graph propagation algorithm identifies the label distribution of unlabeled phrases from labeled ones. We evaluate our approach on two benchmarks: Webquestions and Free917. The results show that, in both datasets, our method achieves substantial improvement when comparing to the base system that does not utilize the learned lexicon, and gains competitive results when comparing to state-of-the-art systems.

pdf bib
Model-Free Context-Aware Word Composition
Bo An | Xianpei Han | Le Sun
Proceedings of the 27th International Conference on Computational Linguistics

Word composition is a promising technique for representation learning of large linguistic units (e.g., phrases, sentences and documents). However, most of the current composition models do not take the ambiguity of words and the context outside of a linguistic unit into consideration for learning representations, and consequently suffer from the inaccurate representation of semantics. To address this issue, we propose a model-free context-aware word composition model, which employs the latent semantic information as global context for learning representations. The proposed model attempts to resolve the word sense disambiguation and word composition in a unified framework. Extensive evaluation shows consistent improvements over various strong word representation/composition models at different granularities (including word, phrase and sentence), demonstrating the effectiveness of our proposed method.

pdf bib
Accurate Text-Enhanced Knowledge Graph Representation Learning
Bo An | Bo Chen | Xianpei Han | Le Sun
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Previous representation learning techniques for knowledge graph representation usually represent the same entity or relation in different triples with the same representation, without considering the ambiguity of relations and entities. To appropriately handle the semantic variety of entities/relations in distinct triples, we propose an accurate text-enhanced knowledge graph representation learning method, which can represent a relation/entity with different representations in different triples by exploiting additional textual information. Specifically, our method enhances representations by exploiting the entity descriptions and triple-specific relation mention. And a mutual attention mechanism between relation mention and entity description is proposed to learn more accurate textual representations for further improving knowledge graph representation. Experimental results show that our method achieves the state-of-the-art performance on both link prediction and triple classification tasks, and significantly outperforms previous text-enhanced knowledge representation models.

pdf bib
Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing
Bo Chen | Le Sun | Xianpei Han
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper proposes a neural semantic parsing approach – Sequence-to-Action, which models semantic parsing as an end-to-end semantic graph generation process. Our method simultaneously leverages the advantages from two recent promising directions of semantic parsing. Firstly, our model uses a semantic graph to represent the meaning of a sentence, which has a tight-coupling with knowledge bases. Secondly, by leveraging the powerful representation learning and prediction ability of neural network models, we propose a RNN model which can effectively map sentences to action sequences for semantic graph generation. Experiments show that our method achieves state-of-the-art performance on Overnight dataset and gets competitive performance on Geo and Atis datasets.

pdf bib
Adaptive Scaling for Sparse Detection in Information Extraction
Hongyu Lin | Yaojie Lu | Xianpei Han | Le Sun
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper focuses on detection tasks in information extraction, where positive instances are sparsely distributed and models are usually evaluated using F-measure on positive classes. These characteristics often result in deficient performance of neural network based detection models. In this paper, we propose adaptive scaling, an algorithm which can handle the positive sparsity problem and directly optimize over F-measure via dynamic cost-sensitive learning. To this end, we borrow the idea of marginal utility from economics and propose a theoretical framework for instance importance measuring without introducing any additional hyper-parameters. Experiments show that our algorithm leads to a more effective and stable training of neural network based detection models.

pdf bib
TDNN: A Two-stage Deep Neural Network for Prompt-independent Automated Essay Scoring
Cancan Jin | Ben He | Kai Hui | Le Sun
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Existing automated essay scoring (AES) models rely on rated essays for the target prompt as training data. Despite their successes in prompt-dependent AES, how to effectively predict essay ratings under a prompt-independent setting remains a challenge, where the rated essays for the target prompt are not available. To close this gap, a two-stage deep neural network (TDNN) is proposed. In particular, in the first stage, using the rated essays for non-target prompts as the training data, a shallow model is learned to select essays with an extreme quality for the target prompt, serving as pseudo training data; in the second stage, an end-to-end hybrid deep model is proposed to learn a prompt-dependent rating model consuming the pseudo training data from the first step. Evaluation of the proposed TDNN on the standard ASAP dataset demonstrates a promising improvement for the prompt-independent AES task.

pdf bib
Nugget Proposal Networks for Chinese Event Detection
Hongyu Lin | Yaojie Lu | Xianpei Han | Le Sun
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Neural network based models commonly regard event detection as a word-wise classification task, which suffer from the mismatch problem between words and event triggers, especially in languages without natural word delimiters such as Chinese. In this paper, we propose Nugget Proposal Networks (NPNs), which can solve the word-trigger mismatch problem by directly proposing entire trigger nuggets centered at each character regardless of word boundaries. Specifically, NPNs perform event detection in a character-wise paradigm, where a hybrid representation for each character is first learned to capture both structural and semantic information from both characters and words. Then based on learned representations, trigger nuggets are proposed and categorized by exploiting character compositional structures of Chinese event triggers. Experiments on both ACE2005 and TAC KBP 2017 datasets show that NPNs significantly outperform the state-of-the-art methods.

pdf bib
NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval
Canjia Li | Yingfei Sun | Ben He | Le Wang | Kai Hui | Andrew Yates | Le Sun | Jungang Xu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Pseudo relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches. While neural retrieval models have recently demonstrated strong results for ad-hoc retrieval, combining them with PRF is not straightforward due to incompatibilities between existing PRF approaches and neural architectures. To bridge this gap, we propose an end-to-end neural PRF framework that can be used with existing neural IR models by embedding different neural models as building blocks. Extensive experiments on two standard test collections confirm the effectiveness of the proposed NPRF framework in improving the performance of two state-of-the-art neural IR models.

2017

pdf bib
Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus
Kees van Deemter | Le Sun | Rint Sybesma | Xiao Li | Bo Chen | Muyun Yang
Proceedings of the 10th International Conference on Natural Language Generation

East Asian languages are thought to handle reference differently from languages such as English, particularly in terms of the marking of definiteness and number. We present the first Data-Text corpus for Referring Expressions in Mandarin, and we use this corpus to test some initial hypotheses inspired by the theoretical linguistics literature. Our findings suggest that function words deserve more attention in Referring Expressions Generation than they have so far received, and they have a bearing on the debate about whether different languages make different trade-offs between clarity and brevity.

pdf bib
Reasoning with Heterogeneous Knowledge for Commonsense Machine Comprehension
Hongyu Lin | Le Sun | Xianpei Han
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Reasoning with commonsense knowledge is critical for natural language understanding. Traditional methods for commonsense machine comprehension mostly only focus on one specific kind of knowledge, neglecting the fact that commonsense reasoning requires simultaneously considering different kinds of commonsense knowledge. In this paper, we propose a multi-knowledge reasoning method, which can exploit heterogeneous knowledge for commonsense machine comprehension. Specifically, we first mine different kinds of knowledge (including event narrative knowledge, entity semantic knowledge and sentiment coherent knowledge) and encode them as inference rules with costs. Then we propose a multi-knowledge reasoning model, which selects inference rules for a specific reasoning context using attention mechanism, and reasons by summarizing all valid inference rules. Experiments on RocStories show that our method outperforms traditional models significantly.

2016

pdf bib
Sentence Rewriting for Semantic Parsing
Bo Chen | Le Sun | Xianpei Han | Bo An
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Context-Sensitive Inference Rule Discovery: A Graph-Based Method
Xianpei Han | Le Sun
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Inference rule discovery aims to identify entailment relations between predicates, e.g., ‘X acquire Y –> X purchase Y’ and ‘X is author of Y –> X write Y’. Traditional methods dis-cover inference rules by computing distributional similarities between predicates, with each predicate is represented as one or more feature vectors of its instantiations. These methods, however, have two main drawbacks. Firstly, these methods are mostly context-insensitive, cannot accurately measure the similarity between two predicates in a specific context. Secondly, traditional methods usually model predicates independently, ignore the rich inter-dependencies between predicates. To address the above two issues, this pa-per proposes a graph-based method, which can discover inference rules by effectively modelling and exploiting both the context and the inter-dependencies between predicates. Specifically, we propose a graph-based representation—Predicate Graph, which can capture the semantic relevance between predicates using both the predicate-feature co-occurrence statistics and the inter-dependencies between predicates. Based on the predicate graph, we propose a context-sensitive random walk algorithm, which can learn con-text-specific predicate representations by distinguishing context-relevant information from context-irrelevant information. Experimental results show that our method significantly outperforms traditional inference rule discovery methods.

pdf bib
ISCAS_NLP at SemEval-2016 Task 1: Sentence Similarity Based on Support Vector Regression using Multiple Features
Cheng Fu | Bo An | Xianpei Han | Le Sun
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
Learning to Mine Query Subtopics from Query Log
Zhenzhong Zhang | Le Sun | Xianpei Han
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing
Le Sun | Chengqing Zong | Min Zhang | Gina-Anne Levow
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
A Probabilistic Co-Bootstrapping Method for Entity Set Expansion
Bei Shi | Zhenzhong Zhang | Le Sun | Xianpei Han
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
A Feature-Enriched Tree Kernel for Relation Extraction
Le Sun | Xianpei Han
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Semantic Consistency: A Local Subspace Based Method for Distant Supervised Relation Extraction
Xianpei Han | Le Sun
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

pdf bib
An Entity-Topic Model for Entity Linking
Xianpei Han | Le Sun
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
A Cascaded Approach for CIPS-SIGHAN Micro-Blog Word Segmentation Bakeoff 2012
Bei Shi | Xianpei Han | Le Sun
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
SIR-NERD: A Chinese Named Entity Recognition and Disambiguation System using a Two-Stage Method
Zehuan Peng | Le Sun | Xianpei Han
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

2011

pdf bib
Improving Word Sense Induction by Exploiting Semantic Relevance
Zhenzhong Zhang | Le Sun
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
A Generative Entity-Mention Model for Linking Entities with Knowledge Base
Xianpei Han | Le Sun
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
Overview of the Chinese Word Sense Induction Task at CLP2010
Le Sun | Zhenzhong Zhang | Qiang Dong
CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
ISCAS: A System for Chinese Word Sense Induction Based on K-means Algorithm
Zhenzhong Zhang | Le Sun | Wenbo Li
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2009

pdf bib
A Syllable-based Name Transliteration System
Xue Jiang | Le Sun | Dakun Zhang
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

2008

pdf bib
A Structured Prediction Approach for Statistical Machine Translation
Dakun Zhang | Le Sun | Wenbo Li
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

pdf bib
Two Step Chinese Named Entity Recognition Based on Conditional Random Fields Models
Yuanyong Feng | Ruihong Huang | Le Sun
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

2006

pdf bib
An Iterative Implicit Feedback Approach to Personalized Search
Yuanhua Lv | Le Sun | Junlin Zhang | Jian-Yun Nie | Wan Chen | Wei Zhang
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Chinese Word Segmentation and Named Entity Recognition Based on Conditional Random Fields Models
Yuanyong Feng | Le Sun | Yuanhua Lv
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing

2004

pdf bib
A Trigger Language Model-based IR System
Jun-lin Zhang | Le Sun | Wei-min Qu | Yu-fang Sun
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2002

pdf bib
Constructing of a Large-Scale Chinese-English Parallel Corpus
Le Sun | Song Xue | Weimin Qu | Xiaofeng Wang | Yufang Sun
COLING-02: The 3rd Workshop on Asian Language Resources and International Standardization

2000

pdf bib
Automatic Extraction of English-Chinese Term Lexicons from Noisy Bilingual Corpora
Le Sun | Youbing Jin | Lin Du | Yufang Sun
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf bib
Query Translation in Chinese-English Cross-Language Information Retrieval
Yibo Zhang | Le Sun | Lin Du | Yufang Sun
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

pdf bib
Word Alignment of English-Chinese Bilingual Corpus Based on Chucks
Le Sun | Youbing Jin | Lin Du | Yufang Sun
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora