Tomás Vergara Browne
2024
From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP
Marius Mosbach | Vagrant Gautam | Tomás Vergara Browne | Dietrich Klakow | Mor Geva
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Interpretability and analysis (IA) research is a growing subfield within NLP with the goal of developing a deeper understanding of the behavior or inner workings of NLP systems and methods. Despite growing interest in the subfield, a common criticism of this work is that it lacks actionable insights and therefore has little impact on NLP. In this paper, we seek to quantify the impact of IA research on the broader field of NLP. We approach this with a mixed-methods analysis of: (1) a citation graph of 185K+ papers built from all papers published at ACL and EMNLP conferences from 2018 to 2023, together with their references and citations, and (2) a survey of 138 members of the NLP community. Our quantitative results show that IA work is well-cited outside of IA and central in the NLP citation graph. Through qualitative analysis of survey responses and manual annotation of 556 papers, we find that NLP researchers build on findings from IA work, perceive it as important for progress in NLP and multiple subfields, and rely on its findings and terminology for their own work. Many novel methods are proposed based on IA findings and are highly influenced by them, but highly influential non-IA work cites IA findings without being driven by them. We end by summarizing what is missing in IA work today and issue a call to action, to pave the way for a more impactful future of IA research.
2023
Large Language Models are biased to overestimate profoundness
Eugenio Herrera-Berg | Tomás Vergara Browne | Pablo León-Villagrá | Marc-Lluís Vives | Cristian Buc Calderon
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Recent advancements in natural language processing by large language models (LLMs), such as GPT-4, have led to suggestions that they approach Artificial General Intelligence. And yet, it is still under dispute whether LLMs possess reasoning abilities similar to humans. This study evaluates GPT-4 and various other LLMs in judging the profoundness of mundane, motivational, and pseudo-profound statements. We found a significant statement-to-statement correlation between the LLMs and humans, irrespective of the type of statements and the prompting technique used. However, LLMs systematically overestimate the profoundness of nonsensical statements, with the exception of Tk-instruct, which uniquely underestimates the profoundness of statements. Only few-shot learning prompts, as opposed to chain-of-thought prompting, draw LLMs' ratings closer to human ones. Furthermore, this work provides insights into the potential biases induced by Reinforcement Learning from Human Feedback (RLHF), which appears to increase the bias to overestimate the profoundness of statements.