The recent Touché lab’s argument retrieval task focuses on controversial topics like ‘Should bottled water be banned?’ and asks participants to retrieve relevant pro/con arguments. Interestingly, the most effective systems submitted to that task are still based on lexical retrieval models like BM25, whereas in other domains neural retrievers that capture semantics are more effective than lexical baselines. To add more “semantics” to argument retrieval, we propose to combine lexical models with DeepCT-based document term weights. Our evaluation shows that our approach is more effective than all systems submitted to the Touché lab while being on par with modern neural re-rankers that are themselves computationally much more expensive.
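The combination can be pictured as swapping BM25’s raw term frequency for a learned term weight. The following is a minimal illustrative sketch of that idea under this assumption, not the paper’s implementation; all weights, IDF values, and document statistics in it are made up.

```python
# Illustrative sketch: a BM25-style score in which the raw term frequency is
# replaced by a learned, DeepCT-style term weight. The weights, IDF values,
# and document statistics below are made up for illustration.
K1, B = 1.2, 0.75

def bm25_with_term_weights(query_terms, doc_weights, doc_len, avg_doc_len, idf):
    """doc_weights: term -> learned importance weight for this document."""
    score = 0.0
    for term in query_terms:
        w = doc_weights.get(term, 0.0)  # learned weight instead of raw tf
        if w <= 0.0:
            continue
        norm = w + K1 * (1 - B + B * doc_len / avg_doc_len)
        score += idf.get(term, 0.0) * (w * (K1 + 1)) / norm
    return score

# Toy example with made-up values:
idf = {"bottled": 2.3, "water": 1.1, "banned": 2.8}
doc_weights = {"bottled": 0.9, "water": 0.7, "banned": 0.4}
print(bm25_with_term_weights(["bottled", "water", "banned"], doc_weights,
                             doc_len=120, avg_doc_len=300, idf=idf))
```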
In this overview paper, we report on the second PAN Clickbait Challenge, hosted as Task 5 at SemEval 2023. The challenge’s focus is to better support social media users by automatically generating short spoilers that close the curiosity gap induced by a clickbait post. We organized two subtasks: (1) spoiler type classification, to assess what kind of spoiler a clickbait post warrants (e.g., a phrase), and (2) spoiler generation, to generate an actual spoiler for a clickbait post.
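As an illustration of what subtask 1 asks for (and not one of the submitted systems), a naive spoiler type classifier could be trained on the post text alone; the toy posts and label names below are assumptions for the sketch.

```python
# Naive illustrative baseline for subtask 1 (spoiler type classification):
# a TF-IDF + logistic regression classifier over the post text that predicts
# the warranted spoiler type. Posts and label names are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    "The one word that will get you a raise",
    "These 7 habits are ruining your sleep",
    "What happened next will surprise you",
]
labels = ["phrase", "multi", "passage"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(posts, labels)
print(clf.predict(["The secret ingredient chefs never tell you about"]))
```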
We propose a re-ranking approach to improve the retrieval effectiveness for non-factual comparative queries like ‘Which city is better, London or Paris?’ based on whether or not the results express a stance towards the comparison objects (London vs. Paris). Applied to the 26 runs submitted to the Touché 2022 task on comparative argument retrieval, our stance-aware re-ranking significantly improves the retrieval effectiveness for all runs when perfect oracle-style stance labels are available. With our most effective practical stance detector based on GPT-3.5 (F₁ of 0.49 on four stance classes), our re-ranking still improves the effectiveness for all runs, but only six of the improvements are significant. By artificially “deteriorating” the oracle-style labels, we further find that an F₁ of 0.90 for stance detection is necessary to significantly improve the retrieval effectiveness of the best run via stance-aware re-ranking.
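Conceptually, the re-ranking only needs one stance label per result. The sketch below is an illustrative variant, not the exact procedure from the paper: results that take a stance toward either comparison object are promoted while the original retrieval order is kept within each group; the label names are placeholders for the four stance classes.

```python
# Illustrative stance-aware re-ranking: promote results that express a stance
# toward either comparison object, preserving the original order within each
# stance group. Label names are placeholders, not the paper's exact labels.
STANCE_PRIORITY = {"pro_first": 0, "pro_second": 0, "neutral": 1, "no_stance": 2}

def stance_aware_rerank(run):
    """run: ranked list of (doc_id, retrieval_score, stance_label) tuples."""
    reranked = sorted(
        enumerate(run),
        key=lambda item: (STANCE_PRIORITY.get(item[1][2], 2), item[0]),
    )
    return [doc for _, doc in reranked]

run = [("d1", 12.3, "no_stance"), ("d2", 11.9, "pro_first"), ("d3", 11.5, "neutral")]
print(stance_aware_rerank(run))  # d2 and d3 move ahead of the stance-less d1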
We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. Clickbait links to a web page and advertises its contents by arousing curiosity instead of providing an informative summary. Our contributions are approaches to classify the type of spoiler needed (i.e., a phrase or a passage), and to generate appropriate spoilers. A large-scale evaluation and error analysis on a new corpus of 5,000 manually spoiled clickbait posts—the Webis Clickbait Spoiling Corpus 2022—shows that our spoiler type classifier achieves an accuracy of 80%, while the question answering model DeBERTa-large outperforms all others in generating spoilers for both types.
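Spoiler generation for phrase-type spoilers can be framed as extractive question answering, with the clickbait post as the question and the linked page as the context. The sketch below assumes a SQuAD-style DeBERTa checkpoint from the Hugging Face hub; the model name is an assumption for illustration, not the exact setup from the paper.

```python
# Minimal sketch of spoiler generation cast as extractive question answering:
# the clickbait post serves as the "question" and the linked article as the
# "context". The checkpoint name is an assumption (any SQuAD-style DeBERTa QA
# model would do); this is not the exact model or setup from the paper.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/deberta-v3-large-squad2")

post = "You won't believe the one food doctors say you should never eat"
article = "..."  # full text of the linked web page would go here

spoiler = qa(question=post, context=article)
print(spoiler["answer"])  # a phrase-type spoiler; passage spoilers need longer spans
```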
Search-Oriented Conversational AI (SCAI) is an established venue that regularly puts a spotlight on recent work advancing the field of conversational search. SCAI’21 was organised as an independent online event and featured a shared task on conversational question answering, on which this paper reports. The shared task comprised three subtasks that correspond to the three steps of conversational question answering: question rewriting, passage retrieval, and answer generation. This report discusses each subtask but emphasizes answer generation, as it attracted the most attention from the participants and because we identified the evaluation of answer correctness in the conversational setting as a major challenge and a current research gap. Alongside the automatic evaluation, we conducted two crowdsourcing experiments to collect annotations for answer plausibility and faithfulness. As a result of this shared task, the original conversational QA dataset used for evaluation was further extended with alternative correct answers produced by the participant systems.
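The three subtasks chain naturally into one pipeline. The sketch below uses hypothetical stand-in components for each stage purely to illustrate the data flow; none of these stubs corresponds to a participant system.

```python
# Illustrative sketch of the three-stage pipeline (question rewriting, passage
# retrieval, answer generation) with hypothetical stand-in components.
from typing import List

def rewrite_question(history: List[str], question: str) -> str:
    """Resolve context-dependent questions using the dialogue history (stub)."""
    return question if not history else f"{question} (given: {history[-1]})"

def retrieve_passages(query: str, corpus: List[str], k: int = 3) -> List[str]:
    """Rank passages by simple term overlap with the rewritten query (stub)."""
    q_terms = set(query.lower().split())
    return sorted(corpus, key=lambda p: -len(q_terms & set(p.lower().split())))[:k]

def generate_answer(question: str, passages: List[str]) -> str:
    """Generate an answer grounded in the retrieved passages (stub)."""
    return f"Answer to '{question}' based on {len(passages)} passage(s)."

def conversational_qa(history: List[str], question: str, corpus: List[str]) -> str:
    rewritten = rewrite_question(history, question)
    return generate_answer(rewritten, retrieve_passages(rewritten, corpus))

corpus = ["Paris is the capital of France.", "London is the capital of England."]
print(conversational_qa(["Tell me about France."], "What is its capital?", corpus))
```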