DivHSK: Diverse Headline Generation using Self-Attention based Keyword Selection

Venkatesh E; Kaushal Maurya; Deepak Kumar; Maunendra Sankar Desarkar

doi:10.18653/v1/2023.findings-acl.118

Abstract

Diverse headline generation is an NLP task where, given a news article, the goal is to generate multiple headlines that are true to the content of the article but differ from one another. This task aims to exhibit and exploit semantically similar one-to-many relationships between a source news article and multiple target headlines. Toward this, we propose a novel model called DIVHSK. It has two components: KEYSELECT, for selecting the important keywords, and SEQGEN, for generating the multiple diverse headlines. In KEYSELECT, we cluster the self-attention heads of the last layer of the pre-trained encoder and select the most-attentive theme and general keywords from the source article. Then, cluster-specific keyword sets guide SEQGEN, a pre-trained encoder-decoder model, to generate diverse yet semantically similar headlines. The proposed model consistently outperformed existing methods from the literature and our strong baselines, and emerged as a state-of-the-art model. We have also created a high-quality multi-reference headline dataset from news articles.
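
The keyword-selection idea described in the abstract (cluster the attention heads, then pick the most-attended tokens per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `select_keywords`, the toy attention values, the use of plain k-means with a deterministic initialisation, and the per-cluster averaging are all assumptions made for illustration. The sketch also does not separate theme keywords from general keywords, which the abstract mentions but does not specify.

```python
from typing import List


def kmeans(points: List[List[float]], k: int, iters: int = 20) -> List[int]:
    """Tiny k-means; returns a cluster index for each point.

    Assumption: centroids are initialised from the first k points so the
    result is deterministic. The abstract does not name a clustering method.
    """
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def select_keywords(tokens: List[str],
                    head_attention: List[List[float]],
                    num_clusters: int,
                    top_k: int) -> List[List[str]]:
    """Return one keyword set per cluster of attention heads.

    head_attention[h][i] is a hypothetical score for how much head h
    attends to token i (for example, attention mass from the last encoder layer).
    """
    assign = kmeans(head_attention, num_clusters)
    keyword_sets = []
    for c in range(num_clusters):
        members = [h for h, a in zip(head_attention, assign) if a == c]
        if not members:
            keyword_sets.append([])
            continue
        # Average the attention of all heads in this cluster, then rank tokens.
        mean_scores = [sum(col) / len(members) for col in zip(*members)]
        ranked = sorted(range(len(tokens)), key=lambda i: -mean_scores[i])
        keyword_sets.append([tokens[i] for i in ranked[:top_k]])
    return keyword_sets


if __name__ == "__main__":
    # Toy data: four heads over five tokens (values are made up).
    toy_tokens = ["storm", "floods", "city", "mayor", "relief"]
    toy_heads = [
        [0.50, 0.40, 0.05, 0.03, 0.02],
        [0.02, 0.03, 0.05, 0.50, 0.40],
        [0.45, 0.45, 0.04, 0.03, 0.03],
        [0.03, 0.02, 0.05, 0.40, 0.50],
    ]
    for keywords in select_keywords(toy_tokens, toy_heads, num_clusters=2, top_k=2):
        print(keywords)
```

In a full pipeline, each resulting keyword set would be passed alongside the article to the sequence generator, so that different clusters steer generation toward different headlines.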

Anthology ID: 2023.findings-acl.118
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1879–1891
URL: https://aclanthology.org/2023.findings-acl.118/
DOI: 10.18653/v1/2023.findings-acl.118
Cite (ACL): Venkatesh E, Kaushal Maurya, Deepak Kumar, and Maunendra Sankar Desarkar. 2023. DivHSK: Diverse Headline Generation using Self-Attention based Keyword Selection. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1879–1891, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): DivHSK: Diverse Headline Generation using Self-Attention based Keyword Selection (E et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.118.pdf
Video: https://aclanthology.org/2023.findings-acl.118.mp4
