Insup Lee


2024

pdf bib
TRAQ: Trustworthy Retrieval Augmented Question Answering via Conformal Prediction
Shuo Li | Sangdon Park | Insup Lee | Osbert Bastani
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

When applied to open-domain question answering, large language models (LLMs) frequently generate incorrect responses based on made-up facts, which are called hallucinations. Retrieval augmented generation (RAG) is a promising strategy to avoid hallucinations, but it does not provide guarantees on its correctness. To address this challenge, we propose the Trustworthy Retrieval Augmented Question Answering, or *TRAQ*, which provides the first end-to-end statistical correctness guarantee for RAG. TRAQ uses conformal prediction, a statistical technique for constructing prediction sets that are guaranteed to contain the semantically correct response with high probability. Additionally, TRAQ leverages Bayesian optimization to minimize the size of the constructed sets. In an extensive experimental evaluation, we demonstrate that TRAQ provides the desired correctness guarantee while reducing prediction set size by 16.2% on average compared to an ablation. The implementation is available: [https://github.com/shuoli90/TRAQ](https://github.com/shuoli90/TRAQ).

2023

pdf bib
Penn & BGU BabyBERTa+ for Strict-Small BabyLM Challenge
Yahan Yang | Elior Sulem | Insup Lee | Dan Roth
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning

pdf bib
Bootstrapping Small & High Performance Language Models with Unmasking-Removal Training Policy
Yahan Yang | Elior Sulem | Insup Lee | Dan Roth
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

BabyBERTa, a language model trained on small-scale child-directed speech while none of the words are unmasked during training, has been shown to achieve a level of grammaticality comparable to that of RoBERTa-base, which is trained on 6,000 times more words and 15 times more parameters. Relying on this promising result, we explore in this paper the performance of BabyBERTa-based models in downstream tasks, focusing on Semantic Role Labeling (SRL) and two Extractive Question Answering tasks, with the aim of building more efficient systems that rely on less data and smaller models. We investigate the influence of these models both alone and as a starting point to larger pre-trained models, separately examining the contribution of the pre-training data, the vocabulary, and the masking policy on the downstream task performance. Our results show that BabyBERTa trained with unmasking-removal policy is a much stronger starting point for downstream tasks compared to the use of RoBERTa masking policy when 10M words are used for training and that this tendency persists, although to a lesser extent, when adding more training data.

pdf bib
In and Out-of-Domain Text Adversarial Robustness via Label Smoothing
Yahan Yang | Soham Dan | Dan Roth | Insup Lee
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Recently it has been shown that state-of-the-art NLP models are vulnerable to adversarial attacks, where the predictions of a model can be drastically altered by slight modifications to the input (such as synonym substitutions). While several defense techniques have been proposed, and adapted, to the discrete nature of text adversarial attacks, the benefits of general-purpose regularization methods such as label smoothing for language models, have not been studied. In this paper, we study the adversarial robustness provided by label smoothing strategies in foundational models for diverse NLP tasks in both in-domain and out-of-domain settings. Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks. We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.

2011

pdf bib
Computing Logical Form on Regulatory Texts
Nikhil Dinesh | Aravind Joshi | Insup Lee
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2006

pdf bib
Extracting formal specifications from natural language regulatory documents
Nikhil Dinesh | Aravind Joshi | Insup Lee | Bonnie Webber
Proceedings of the Fifth International Workshop on Inference in Computational Semantics (ICoS-5)