A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining

Chenguang Zhu, Ruochen Xu, Michael Zeng, Xuedong Huang


Abstract
With the abundance of automatic meeting transcripts, meeting summarization is of great interest to both participants and other parties. Traditional methods of summarizing meetings depend on complex multi-step pipelines that make joint optimization intractable. Meanwhile, there are a handful of deep neural models for text summarization and dialogue systems. However, the semantic structure and styles of meeting transcripts are quite different from those of articles and conversations. In this paper, we propose a novel abstractive summary network that adapts to the meeting scenario. We design a hierarchical structure to accommodate long meeting transcripts and a role vector to depict the difference among speakers. Furthermore, due to the inadequacy of meeting summary data, we pretrain the model on large-scale news summary data. Empirical results show that our model outperforms previous approaches in both automatic metrics and human evaluation. For example, on the ICSI dataset, the ROUGE-1 score increases from 34.66% to 46.28%.
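The ROUGE-1 figures quoted in the abstract measure unigram overlap between a generated summary and a reference summary. As a rough illustration only, the Python sketch below computes a simplified ROUGE-1 using lowercasing and whitespace tokenization. The function name `rouge1` and the example sentences are hypothetical, and this is not the evaluation script used in the paper, whose exact scoring settings are not stated on this page.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: lowercased whitespace tokens, no stemming or stopword removal.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count of each shared token,
    # so a repeated word is only credited as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from any dataset.
scores = rouge1(
    "the team agreed on a remote design",
    "the team agreed on the remote control design",
)
print(scores)
```

Six of the seven candidate tokens match the eight-token reference here, so recall is 6/8 and precision is 6/7. Published ROUGE scores typically come from a standard toolkit, so numbers from this sketch are not directly comparable to those in the abstract.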
Anthology ID:
2020.findings-emnlp.19
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | Findings
Publisher:
Association for Computational Linguistics
Pages:
194–203
URL:
https://aclanthology.org/2020.findings-emnlp.19
DOI:
10.18653/v1/2020.findings-emnlp.19
Cite (ACL):
Chenguang Zhu, Ruochen Xu, Michael Zeng, and Xuedong Huang. 2020. A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 194–203, Online. Association for Computational Linguistics.
Cite (Informal):
A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining (Zhu et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.19.pdf
Video:
https://slideslive.com/38940695
Code:
microsoft/HMNet
Data:
CNN/Daily Mail