TWEETSUMM - A Dialog Summarization Dataset for Customer Service

Guy Feigenblat, Chulaka Gunasekara, Benjamin Sznajder, Sachindra Joshi, David Konopnicki, Ranit Aharonov


Abstract
In a typical customer service chat scenario, customers contact a support center to ask for help or raise complaints, and human agents try to solve the issues. In most cases, at the end of the conversation, agents are asked to write a short summary emphasizing the problem and the proposed solution, usually for the benefit of other agents who may have to deal with the same customer or issue. The goal of the present article is to advance the automation of this task. We introduce the first large-scale, high-quality customer care dialog summarization dataset, with close to 6500 human-annotated summaries. The data is based on real-world customer support dialogs and includes both extractive and abstractive summaries. We also introduce a new unsupervised, extractive summarization method specific to dialogs.
Anthology ID:
2021.findings-emnlp.24
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
245–260
URL:
https://aclanthology.org/2021.findings-emnlp.24
DOI:
10.18653/v1/2021.findings-emnlp.24
Cite (ACL):
Guy Feigenblat, Chulaka Gunasekara, Benjamin Sznajder, Sachindra Joshi, David Konopnicki, and Ranit Aharonov. 2021. TWEETSUMM - A Dialog Summarization Dataset for Customer Service. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 245–260, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
TWEETSUMM - A Dialog Summarization Dataset for Customer Service (Feigenblat et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.24.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.24.mp4