Hai Zhuge


2018

Abstractive Text-Image Summarization Using Multi-Modal Attentional Hierarchical RNN
Jingqiang Chen | Hai Zhuge
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Rapid growth of multi-modal documents on the Internet makes multi-modal summarization research necessary. Most previous research summarizes texts or images separately. Recent neural summarization research shows the strength of the Encoder-Decoder model in text summarization. This paper proposes an abstractive text-image summarization model using the attentional hierarchical Encoder-Decoder model to summarize a text document and its accompanying images simultaneously, and then to align the sentences and images in summaries. A multi-modal attentional mechanism is proposed to attend original sentences, images, and captions when decoding. The DailyMail dataset is extended by collecting images and captions from the Web. Experiments show our model outperforms the neural abstractive and extractive text summarization methods that do not consider images. In addition, our model can generate informative summaries of images.

2016

Abstractive News Summarization based on Event Semantic Link Network
Wei Li | Lei He | Hai Zhuge
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper studies the abstractive multi-document summarization for event-oriented news texts through event information extraction and abstract representation. Fine-grained event mentions and semantic relations between them are extracted to build a unified and connected event semantic link network, an abstract representation of source texts. A network reduction algorithm is proposed to summarize the most salient and coherent event information. New sentences with good linguistic quality are automatically generated and selected through sentences over-generation and greedy-selection processes. Experimental results on DUC 2006 and DUC 2007 datasets show that our system significantly outperforms the state-of-the-art extractive and abstractive baselines under both pyramid and ROUGE evaluation metrics.

Exploring Differential Topic Models for Comparative Summarization of Scientific Papers
Lei He | Wei Li | Hai Zhuge
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper investigates differential topic models (dTM) for summarizing the differences among document groups. Starting from a simple probabilistic generative model, we propose dTM-SAGE that explicitly models the deviations on group-specific word distributions to indicate how words are used differentially across different document groups from a background word distribution. It is more effective to capture unique characteristics for comparing document groups. To generate dTM-based comparative summaries, we propose two sentence scoring methods for measuring the sentence discriminative capacity. Experimental results on scientific papers dataset show that our dTM-based comparative summarization methods significantly outperform the generic baselines and the state-of-the-art comparative summarization methods under ROUGE metrics.

2013

Are School-of-thought Words Characterizable?
Xiaorui Jiang | Xiaoping Sun | Hai Zhuge
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)