Bernhard Pflugfelder
2024
RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
Mohammed Abdul Khaliq | Paul Yu-Chun Chang | Mingyang Ma | Bernhard Pflugfelder | Filip Miletić
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)
The escalating challenge of misinformation, particularly in political discourse, requires advanced fact-checking solutions; this is even clearer in the more complex scenario of multimodal claims. We tackle this issue using a multimodal large language model in conjunction with retrieval-augmented generation (RAG), and introduce two novel reasoning techniques: Chain of RAG (CoRAG) and Tree of RAG (ToRAG). They fact-check multimodal claims by extracting both textual and image content, retrieving external information, and reasoning about which questions to answer next based on prior evidence. We achieve a weighted F1-score of 0.85, surpassing a baseline reasoning technique by 0.14 points. Human evaluation confirms that the vast majority of our generated fact-check explanations contain all information from gold standard data.
2022
User Satisfaction Modeling with Domain Adaptation in Task-oriented Dialogue Systems
Yan Pan | Mingyang Ma | Bernhard Pflugfelder | Georg Groh
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue
User Satisfaction Estimation (USE) is crucial in helping measure the quality of a task-oriented dialogue system. However, the complex nature of implicit responses poses challenges in detecting user satisfaction, and most datasets are limited in size or not available to the public due to user privacy policies. Unlike task-oriented dialogue, large-scale annotated chitchat with emotion labels is publicly available. Therefore, we present a novel user satisfaction model with domain adaptation (USMDA) to utilize this chitchat. We adopt a dialogue Transformer encoder to capture contextual features from the dialogue, and we reduce domain discrepancy to learn dialogue-related invariant features. Moreover, USMDA jointly learns satisfaction signals in the chitchat context with user satisfaction estimation, and user actions in task-oriented dialogue with dialogue action recognition. Experimental results on two benchmarks show that our proposed framework for the USE task outperforms existing unsupervised domain adaptation methods. To the best of our knowledge, this is the first work to study user satisfaction estimation with unsupervised domain adaptation from chitchat to task-oriented dialogue.
Co-authors
- Mingyang Ma 2
- Paul Yu-Chun Chang 1
- Georg Groh 1
- Mohammed Abdul Khaliq 1
- Filip Miletić 1
- Yan Pan 1