Diogo Pernes


2023

Supervising the Centroid Baseline for Extractive Multi-Document Summarization
Simão Gonçalves | Gonçalo Correia | Diogo Pernes | Afonso Mendes
Proceedings of the 4th New Frontiers in Summarization Workshop

The centroid method is a simple approach for extractive multi-document summarization and many improvements to its pipeline have been proposed. We further refine it by adding a beam search process to the sentence selection and also a centroid estimation attention model that leads to improved results. We demonstrate this in several multi-document summarization datasets, including in a multilingual scenario.
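To make the sentence-selection idea concrete, the following is a minimal sketch of centroid-based extraction with beam search. It is not the authors' implementation: the bag-of-words vectors, the cosine scoring, and the names `bow`, `cosine`, `beam_search_summary`, `budget`, and `beam_size` are all assumptions chosen for illustration, and the attention model for centroid estimation mentioned in the abstract is not covered.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector as a Counter of lowercase whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def beam_search_summary(sentences, budget=2, beam_size=3):
    """Pick up to `budget` sentences whose combined vector is close to the centroid.

    Assumption: the centroid is the plain sum of all sentence vectors
    (cosine similarity ignores the scale, so no averaging is needed).
    """
    vectors = [bow(s) for s in sentences]
    centroid = sum(vectors, Counter())

    def score(indices):
        summary_vec = sum((vectors[i] for i in indices), Counter())
        return cosine(summary_vec, centroid)

    beam = [()]  # each state is a sorted tuple of selected sentence indices
    for _ in range(min(budget, len(sentences))):
        candidates = set()
        for state in beam:
            for i in range(len(sentences)):
                if i not in state:
                    candidates.add(tuple(sorted(state + (i,))))
        # Keep the best `beam_size` partial summaries; ties break on indices.
        beam = sorted(candidates, key=lambda s: (-score(s), s))[:beam_size]
    return [sentences[i] for i in beam[0]]
```

With `beam_size=1` this reduces to greedy selection; a larger beam keeps several partial summaries alive, so an early choice that looks slightly worse can still lead to the best final summary.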

2022

Improving abstractive summarization with energy-based re-ranking
Diogo Pernes | Afonso Mendes | André F. T. Martins
Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)

Current abstractive summarization systems present important weaknesses which prevent their deployment in real-world applications, such as the omission of relevant information and the generation of factual inconsistencies (also known as hallucinations). At the same time, automatic evaluation metrics such as CTC scores (Deng et al., 2021) have been recently proposed that exhibit a higher correlation with human judgments than traditional lexical-overlap metrics such as ROUGE. In this work, we intend to close the loop by leveraging the recent advances in summarization metrics to create quality-aware abstractive summarizers. Namely, we propose an energy-based model that learns to re-rank summaries according to one or a combination of these metrics. We experiment using several metrics to train our energy-based re-ranker and show that it consistently improves the scores achieved by the predicted summaries. Nonetheless, human evaluation results show that the re-ranking approach should be used with care for highly abstractive summaries, as the available metrics are not yet sufficiently reliable for this purpose.
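As a rough illustration of re-ranking with an energy function, the sketch below orders candidate summaries by energy (lower is better) and computes a pairwise margin loss that pushes candidates with higher metric scores toward lower energy. The hinge loss, the `margin` parameter, and the names `rerank` and `pairwise_ranking_loss` are assumptions made for this example; the abstract does not specify the training objective, so this should not be read as the paper's method.

```python
def rerank(candidates, energy):
    """Return candidates sorted from lowest to highest energy (best first)."""
    return sorted(candidates, key=energy)


def pairwise_ranking_loss(energies, metric_scores, margin=1.0):
    """Mean hinge loss over ordered pairs of candidates.

    For every pair where candidate i has a higher metric score than
    candidate j, the loss is zero only if energy[i] is at least
    `margin` below energy[j]. This objective is an assumption for
    illustration, not the one used in the paper.
    """
    total, pairs = 0.0, 0
    n = len(energies)
    for i in range(n):
        for j in range(n):
            if metric_scores[i] > metric_scores[j]:
                total += max(0.0, margin + energies[i] - energies[j])
                pairs += 1
    return total / pairs if pairs else 0.0
```

In practice the energy function would be a trained model scoring a (document, summary) pair, and the metric scores would come from an automatic metric such as those mentioned in the abstract; here both are left as plain Python values.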