WSL-DS: Weakly Supervised Learning with Distant Supervision for Query Focused Multi-Document Abstractive Summarization

Md Tahmid Rahman Laskar, Enamul Hoque, Jimmy Xiangji Huang


Abstract
In the Query Focused Multi-Document Summarization (QF-MDS) task, a set of documents and a query are given, and the goal is to generate a summary of these documents based on the given query. One major challenge for this task is the lack of labeled training datasets. To overcome this issue, we propose a novel weakly supervised learning approach that utilizes distant supervision. In particular, we use datasets similar to the target dataset as the training data, and we leverage pre-trained sentence similarity models to generate a weak reference summary for each individual document in a document set from the multi-document gold reference summaries. We then iteratively train our summarization model on each single document to alleviate the computational complexity that arises when neural summarization models are trained on multiple documents (i.e., long sequences) at once. Experimental results on the Document Understanding Conferences (DUC) datasets show that our proposed approach sets a new state-of-the-art result in terms of various evaluation metrics.
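The weak-labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words cosine below is only a stand-in for the pre-trained sentence similarity models the abstract mentions, and the function names, the `top_k` parameter, and the rule of scoring each document sentence by its best match against any gold sentence are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity (stand-in for a pre-trained model)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weak_reference_summary(doc_sentences, gold_sentences, top_k=2,
                           similarity=cosine_similarity):
    """Pick the top_k sentences of ONE document that best match the
    multi-document gold summary (hypothetical selection rule)."""
    scored = []
    for idx, sentence in enumerate(doc_sentences):
        # Score a sentence by its best match against any gold sentence.
        score = max((similarity(sentence, g) for g in gold_sentences),
                    default=0.0)
        scored.append((score, idx))
    # Highest score first; ties go to the earlier sentence.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_k]
    # Return the selected sentences in their original document order.
    return [doc_sentences[i] for _, i in sorted(best, key=lambda p: p[1])]


# Toy document set with one shared gold summary (illustrative data only).
document_set = {
    "doc1": [
        "Solar power capacity grew rapidly last year.",
        "The weather was mild in spring.",
        "Falling panel prices drove the growth in solar capacity.",
    ],
    "doc2": [
        "Analysts expect panel prices to keep falling.",
        "A local festival drew large crowds.",
    ],
}
gold_summary = ["Solar capacity grew because panel prices fell."]

# One weak reference per document, so a summarizer could be trained on
# single documents instead of the whole (long) document set at once.
weak_references = {
    name: weak_reference_summary(sentences, gold_summary, top_k=1)
    for name, sentences in document_set.items()
}
```

Passing a different callable as `similarity` would let a real sentence-similarity model replace the toy cosine without changing the selection logic.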
Anthology ID:
2020.coling-main.495
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5647–5654
URL:
https://aclanthology.org/2020.coling-main.495
DOI:
10.18653/v1/2020.coling-main.495
Cite (ACL):
Md Tahmid Rahman Laskar, Enamul Hoque, and Jimmy Xiangji Huang. 2020. WSL-DS: Weakly Supervised Learning with Distant Supervision for Query Focused Multi-Document Abstractive Summarization. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5647–5654, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
WSL-DS: Weakly Supervised Learning with Distant Supervision for Query Focused Multi-Document Abstractive Summarization (Laskar et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.495.pdf
Code
 tahmedge/WSL-DS-COLING-2020
Data
CNN/Daily Mail, MS MARCO