A Graph-theoretic Summary Evaluation for ROUGE

Elaheh ShafieiBavani, Mohammad Ebrahimi, Raymond Wong, Fang Chen


Abstract
ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment relies solely on surface similarities between peer and model summaries. Consequently, ROUGE cannot fairly evaluate summaries that contain lexical variation or paraphrasing. We propose a graph-based approach, integrated into ROUGE, that evaluates summaries on both lexical and semantic similarity. Experimental results on the TAC AESOP datasets show that exploiting the lexico-semantic similarity of the words used in summaries significantly helps ROUGE correlate better with human judgments.
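To make the surface-similarity limitation concrete, the following is a minimal sketch of unigram recall in the spirit of ROUGE-1. It is not the official ROUGE toolkit and not the graph-based method proposed in the paper; the function name `rouge1_recall`, the whitespace tokenization, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter


def rouge1_recall(peer: str, model: str) -> float:
    """Unigram recall of a peer summary against one model summary.

    Hypothetical helper: lowercases and splits on whitespace, which is a
    simplification of the preprocessing a real ROUGE implementation applies.
    """
    peer_counts = Counter(peer.lower().split())
    model_counts = Counter(model.lower().split())
    if not model_counts:
        return 0.0
    # Clipped overlap: a model unigram is matched at most as many times
    # as it occurs in the peer summary.
    overlap = sum(
        min(count, peer_counts[word]) for word, count in model_counts.items()
    )
    return overlap / sum(model_counts.values())


model = "the doctor treated the patient"
exact = "the doctor treated the patient"
paraphrase = "a physician cured the sick man"

exact_score = rouge1_recall(exact, model)
paraphrase_score = rouge1_recall(paraphrase, model)
print(exact_score, paraphrase_score)
```

In this toy case the paraphrase shares only the word "the" with the model summary, so it scores far lower than the verbatim copy even though a reader would likely judge the two close in meaning. That gap is the kind of case the abstract says a lexico-semantic measure is meant to address.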
Anthology ID:
D18-1085
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
762–767
URL:
https://aclanthology.org/D18-1085
DOI:
10.18653/v1/D18-1085
Cite (ACL):
Elaheh ShafieiBavani, Mohammad Ebrahimi, Raymond Wong, and Fang Chen. 2018. A Graph-theoretic Summary Evaluation for ROUGE. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 762–767, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
A Graph-theoretic Summary Evaluation for ROUGE (ShafieiBavani et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1085.pdf