SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy

Umanga Bista, Alexander Mathews, Aditya Menon, Lexing Xie


Abstract
Most work on multi-document summarization has focused on generic summarization of information present in each individual document set. However, the under-explored setting of update summarization, where the goal is to identify the new information present in each set, is of equal practical interest (e.g., presenting readers with updates on an evolving news topic). In this work, we present SupMMD, a novel technique for generic and update summarization based on the maximum mean discrepancy from kernel two-sample testing. SupMMD combines supervised learning for salience with unsupervised learning for coverage and diversity. Further, we adapt multiple kernel learning to make use of similarity across multiple information sources (e.g., text features and knowledge-based concepts). We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
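For readers unfamiliar with the quantity named in the abstract, the following is a minimal sketch of the standard (biased) estimate of the squared maximum mean discrepancy between two sets of vectors, here with a Gaussian kernel. It is not the authors' SupMMD model, which additionally learns sentence importance with supervision and combines multiple kernels; the function names, the `gamma` parameter, and the toy two-dimensional "sentence vectors" are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


# Hypothetical toy data: four "sentence vectors" and two candidate summaries.
document = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
summary_a = [[0.0, 0.0], [1.0, 1.0]]  # spread across the document
summary_b = [[0.0, 0.0], [0.1, 0.0]]  # two near-duplicate sentences

print(mmd_squared(document, summary_a))
print(mmd_squared(document, summary_b))
```

In this toy setting the spread-out summary has the smaller discrepancy from the document than the near-duplicate one, which illustrates how an MMD-style objective can reward coverage and diversity without labels.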
Anthology ID:
2020.findings-emnlp.367
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | Findings
Publisher:
Association for Computational Linguistics
Pages:
4108–4122
URL:
https://aclanthology.org/2020.findings-emnlp.367
DOI:
10.18653/v1/2020.findings-emnlp.367
Cite (ACL):
Umanga Bista, Alexander Mathews, Aditya Menon, and Lexing Xie. 2020. SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4108–4122, Online. Association for Computational Linguistics.
Cite (Informal):
SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy (Bista et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.367.pdf
Video:
https://slideslive.com/38940131
Code:
computationalmedia/supmmd