Towards Abstractive Grounded Summarization of Podcast Transcripts

Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Fei Liu


Abstract
Podcasts have shown a recent rise in popularity. Summarization of podcasts is of practical benefit to both content providers and consumers. It helps people quickly decide whether they will listen to a podcast and/or reduces the cognitive load of content providers to write summaries. Nevertheless, podcast summarization faces significant challenges including factual inconsistencies of summaries with respect to the inputs. The problem is exacerbated by speech disfluencies and recognition errors in transcripts of spoken language. In this paper, we explore a novel abstractive summarization method to alleviate these issues. Our approach learns to produce an abstractive summary while grounding summary segments in specific regions of the transcript to allow for full inspection of summary details. We conduct a series of analyses of the proposed approach on a large podcast dataset and show that the approach can achieve promising results. Grounded summaries bring clear benefits in locating the summary and transcript segments that contain inconsistent information, and hence improve summarization quality in terms of automatic and human evaluation.
Anthology ID:
2022.acl-long.302
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4407–4418
URL:
https://aclanthology.org/2022.acl-long.302
DOI:
10.18653/v1/2022.acl-long.302
Cite (ACL):
Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, and Fei Liu. 2022. Towards Abstractive Grounded Summarization of Podcast Transcripts. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4407–4418, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Towards Abstractive Grounded Summarization of Podcast Transcripts (Song et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.302.pdf
Code:
tencent-ailab/grndpodcastsum