Lei He


2021

pdf bib
KLMo: Knowledge Graph Enhanced Pretrained Language Model with Fine-Grained Relationships
Lei He | Suncong Zheng | Tao Yang | Feng Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

Interactions between entities in knowledge graph (KG) provide rich knowledge for language representation learning. However, existing knowledge-enhanced pretrained language models (PLMs) only focus on entity information and ignore the fine-grained relationships between entities. In this work, we propose to incorporate KG (including both entities and relations) into the language learning process to obtain KG-enhanced pretrained Language Model, namely KLMo. Specifically, a novel knowledge aggregator is designed to explicitly model the interaction between entity spans in text and all entities and relations in a contextual KG. An relation prediction objective is utilized to incorporate relation information by distant supervision. An entity linking objective is further utilized to link entity spans in text to entities in KG. In this way, the structured knowledge can be effectively integrated into language representations. Experimental results demonstrate that KLMo achieves great improvements on several knowledge-driven tasks, such as entity typing and relation classification, comparing with the state-of-the-art knowledge-enhanced PLMs.

2016

pdf bib
Abstractive News Summarization based on Event Semantic Link Network
Wei Li | Lei He | Hai Zhuge
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper studies the abstractive multi-document summarization for event-oriented news texts through event information extraction and abstract representation. Fine-grained event mentions and semantic relations between them are extracted to build a unified and connected event semantic link network, an abstract representation of source texts. A network reduction algorithm is proposed to summarize the most salient and coherent event information. New sentences with good linguistic quality are automatically generated and selected through sentences over-generation and greedy-selection processes. Experimental results on DUC 2006 and DUC 2007 datasets show that our system significantly outperforms the state-of-the-art extractive and abstractive baselines under both pyramid and ROUGE evaluation metrics.

pdf bib
Exploring Differential Topic Models for Comparative Summarization of Scientific Papers
Lei He | Wei Li | Hai Zhuge
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper investigates differential topic models (dTM) for summarizing the differences among document groups. Starting from a simple probabilistic generative model, we propose dTM-SAGE that explicitly models the deviations on group-specific word distributions to indicate how words are used differen-tially across different document groups from a background word distribution. It is more effective to capture unique characteristics for comparing document groups. To generate dTM-based comparative summaries, we propose two sentence scoring methods for measuring the sentence discriminative capacity. Experimental results on scientific papers dataset show that our dTM-based comparative summari-zation methods significantly outperform the generic baselines and the state-of-the-art comparative summarization methods under ROUGE metrics.

pdf bib
Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network
Peilu Wang | Yao Qian | Frank K. Soong | Lei He | Hai Zhao
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies