So Yeon Min


2024

Tools Fail: Detecting Silent Errors in Faulty Tools
Jimin Sun | So Yeon Min | Yingshan Chang | Yonatan Bisk
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Tools have become a mainstay of LLMs, allowing them to retrieve knowledge not in their weights, to perform tasks on the web, and even to control robots. However, most ontologies and surveys of tool-use have assumed the core challenge for LLMs is choosing the tool. Instead, we introduce a framework for tools more broadly which guides us to explore a model’s ability to detect “silent” tool errors, and reflect on how to plan. This more directly aligns with the increasingly popular use of models as tools. We provide an initial approach to failure recovery with promising results both on a controlled calculator setting and embodied agent planning.

2022

Don’t Copy the Teacher: Data and Model Challenges in Embodied Dialogue
So Yeon Min | Hao Zhu | Ruslan Salakhutdinov | Yonatan Bisk
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Embodied dialogue instruction following requires an agent to complete a complex sequence of tasks from a natural language exchange. The recent introduction of benchmarks raises the question of how best to train and evaluate models for this multi-turn, multi-agent, long-horizon task. This paper contributes to that conversation, by arguing that imitation learning (IL) and related low-level metrics are actually misleading and do not align with the goals of embodied dialogue research and may hinder progress. We provide empirical comparisons of metrics, analysis of three models, and make suggestions for how the field might best progress. First, we observe that models trained with IL take spurious actions during evaluation. Second, we find that existing models fail to ground query utterances, which are essential for task completion. Third, we argue evaluation should focus on higher-level semantic goals. We will release code to additionally filter the data and benchmark models for improved evaluation.

2020

Entity-Enriched Neural Models for Clinical Question Answering
Bhanu Pratap Singh Rawat | Wei-Hung Weng | So Yeon Min | Preethi Raghavan | Peter Szolovits
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing

We explore state-of-the-art neural models for question answering on electronic medical records and improve their ability to generalize better on previously unseen (paraphrased) questions at test time. We enable this by learning to predict logical forms as an auxiliary task along with the main task of answer span detection. The predicted logical forms also serve as a rationale for the answer. Further, we also incorporate medical entity information in these models via the ERNIE architecture. We train our models on the large-scale emrQA dataset and observe that our multi-task entity-enriched models generalize to paraphrased questions ~5% better than the baseline BERT model.

Advancing Seq2seq with Joint Paraphrase Learning
So Yeon Min | Preethi Raghavan | Peter Szolovits
Proceedings of the 3rd Clinical Natural Language Processing Workshop

We address the problem of model generalization for sequence to sequence (seq2seq) architectures. We propose going beyond data augmentation via paraphrase-optimized multi-task learning and observe that it is useful in correctly handling unseen sentential paraphrases as inputs. Our models greatly outperform SOTA seq2seq models for semantic parsing on diverse domains (Overnight - up to 3.2% and emrQA - 7%) and Nematus, the winning solution for WMT 2017, for Czech to English translation (CzENG 1.6 - 1.5 BLEU).