Extractive Summarization under Strict Length Constraints

Yashar Mehdad, Amanda Stent, Kapil Thadani, Dragomir Radev, Youssef Billawala, Karolina Buchner


Abstract
In this paper we report a comparison of various techniques for single-document extractive summarization under strict length budgets, which is a common commercial use case (e.g. summarization of news articles by news aggregators). We show that, evaluated using ROUGE, numerous algorithms from the literature fail to beat a simple lead-based baseline for this task. However, a supervised approach with lightweight and efficient features improves over the lead-based baseline. Additional human evaluation demonstrates that the supervised approach also performs competitively with a commercial system that uses more sophisticated features.
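As an illustration of the lead-based baseline mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the length budget is counted in characters and uses a naive regex sentence splitter; the function name `lead_summary` and both of those choices are hypothetical, since the abstract does not specify them.

```python
import re


def lead_summary(text, budget):
    """Return the leading whole sentences of `text` that fit in `budget` characters.

    Assumptions (not taken from the paper): the budget is measured in
    characters, and sentences are split naively on ., ! or ? followed
    by whitespace.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chosen = []
    length = 0
    for sentence in sentences:
        # Account for the joining space between sentences.
        extra = len(sentence) + (1 if chosen else 0)
        if length + extra > budget:
            break  # strict lead: stop at the first sentence that does not fit
        chosen.append(sentence)
        length += extra
    return " ".join(chosen)


article = (
    "Stocks fell on Monday. Analysts blamed weak earnings. "
    "Markets may recover."
)
print(lead_summary(article, 60))
```

With a 60-character budget, the sketch keeps the first two sentences and drops the third. If even the first sentence exceeds the budget, it returns an empty string rather than truncating mid-sentence.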
Anthology ID:
L16-1493
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3089–3093
URL:
https://aclanthology.org/L16-1493
Cite (ACL):
Yashar Mehdad, Amanda Stent, Kapil Thadani, Dragomir Radev, Youssef Billawala, and Karolina Buchner. 2016. Extractive Summarization under Strict Length Constraints. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3089–3093, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Extractive Summarization under Strict Length Constraints (Mehdad et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1493.pdf