Camille Thibault
2024
Uncertainty Resolution in Misinformation Detection
Yury Orlovskiy, Camille Thibault, Anne Imouza, Jean-François Godbout, Reihaneh Rabbany, Kellin Pelrine
Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024)
Misinformation poses a variety of risks, such as undermining public trust and distorting factual discourse. Large Language Models (LLMs) like GPT-4 have been shown to be effective in mitigating misinformation, particularly in handling statements where enough context is provided. However, they struggle to assess ambiguous or context-deficient statements accurately. This work introduces a new method to resolve uncertainty in such statements. We propose a framework to categorize missing information and publish category labels for the LIAR-New dataset, which is adaptable to cross-domain content with missing information. We then leverage this framework to generate effective user queries for missing context. Compared to baselines, our method improves the rate at which generated questions are answerable by the user by 38 percentage points and classification performance by over 10 percentage points macro F1. Thus, this approach may provide a valuable component for future misinformation mitigation pipelines.
2023
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Kellin Pelrine, Anne Imouza, Camille Thibault, Meilina Reksoprodjo, Caleb Gupta, Joel Christoph, Jean-François Godbout, Reihaneh Rabbany
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Misinformation poses a critical societal challenge, and current approaches have yet to produce an effective solution. We propose focusing on generalization, uncertainty, and how to leverage recent large language models, in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible. We first demonstrate that GPT-4 can outperform prior methods in multiple settings and languages. Next, we explore generalization, revealing that GPT-4 and RoBERTa-large exhibit differences in failure modes. Third, we propose techniques to handle uncertainty that can detect impossible examples and strongly improve outcomes. We also discuss results on other language models, temperature, prompting, versioning, explainability, and web retrieval, each one providing practical insights and directions for future research. Finally, we publish the LIAR-New dataset with novel paired English and French misinformation data and Possibility labels that indicate if there is sufficient context for veracity evaluation. Overall, this research lays the groundwork for future tools that can drive real-world progress to combat misinformation.