Template-based Abstractive Microblog Opinion Summarization

Iman Munire Bilal, Bo Wang, Adam Tsakalidis, Dong Nguyen, Rob Procter, Maria Liakata


Abstract
We introduce the task of microblog opinion summarization (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarization dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarizing news articles following a template separating factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favors extractive summarization models. To showcase the dataset’s utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarization models and achieve good performance, with the former outperforming the latter. We also show that fine-tuning is necessary to improve performance and investigate the benefits of using different sample sizes.
Anthology ID:
2022.tacl-1.71
Volume:
Transactions of the Association for Computational Linguistics, Volume 10
Year:
2022
Address:
Cambridge, MA
Editors:
Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
1229–1248
URL:
https://aclanthology.org/2022.tacl-1.71
DOI:
10.1162/tacl_a_00516
Cite (ACL):
Iman Munire Bilal, Bo Wang, Adam Tsakalidis, Dong Nguyen, Rob Procter, and Maria Liakata. 2022. Template-based Abstractive Microblog Opinion Summarization. Transactions of the Association for Computational Linguistics, 10:1229–1248.
Cite (Informal):
Template-based Abstractive Microblog Opinion Summarization (Bilal et al., TACL 2022)
PDF:
https://aclanthology.org/2022.tacl-1.71.pdf
Video:
https://aclanthology.org/2022.tacl-1.71.mp4