Steffen Thoma


2024

pdf bib
FZI-WIM at AVeriTeC Shared Task: Real-World Fact-Checking with Question Answering
Jin Liu | Steffen Thoma | Achim Rettinger
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)

This paper describes the FZI-WIM system at the AVeriTeC shared Task, which aims to assess evidence-based automated fact-checking systems for real-world claims with evidence retrieved from the web. The FZI-WIM system utilizes open-source models to build a reliable fact-checking pipeline via question-answering. With different experimental setups, we show that more questions lead to higher scores in the shared task. Both in question generation and question-answering stages, sampling can be a way to improve the performance of our system. We further analyze the limitations of current open-source models for real-world claim verification. Our code is publicly available https://github.com/jens5588/FZI-WIM-AVERITEC.

pdf bib
FZI-WIM at SemEval-2024 Task 2: Self-Consistent CoT for Complex NLI in Biomedical Domain
Jin Liu | Steffen Thoma
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes the inference system of FZI-WIM at the SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. Our system utilizes the chain of thought (CoT) paradigm to tackle this complex reasoning problem and further improve the CoT performance with self-consistency. Instead of greedy decoding, we sample multiple reasoning chains with the same prompt and make thefinal verification with majority voting. The self-consistent CoT system achieves a baseline F1 score of 0.80 (1st), faithfulness score of 0.90 (3rd), and consistency score of 0.73 (12th). We release the code and data publicly.