Md Mosharaf Hossain


2024

pdf bib
HalluMeasure: Fine-grained Hallucination Measurement Using Chain-of-Thought Reasoning
Shayan Ali Akbar | Md Mosharaf Hossain | Tess Wood | Si-Chi Chin | Erica M Salinas | Victor Alvarez | Erwin Cornejo
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Automating the measurement of hallucinations in LLM generated responses is a challenging task as it requires careful investigation of each factual claim in a response. In this paper, we introduce HalluMeasure, a new LLM-based hallucination detection mechanism that decomposes an LLM response into atomic claims, and evaluates each atomic claim against the provided reference context. The model uses a step-by-step reasoning process called Chain-of-Thought and can identify 3 major categories of hallucinations (e.g., contradiction) as well as 10 more specific subtypes (e.g., overgeneralization) which help to identify reasons behind the hallucination errors. Specifically, we explore four different configurations for HalluMeasure’s classifier: with and without CoT prompting, and using a single classifier call to classify all claims versus separate calls for each claim. The best-performing configuration (with CoT and separate calls for each claim) demonstrates significant improvements in detecting hallucinations, achieving a 10-point increase in F1 score on our TechNewsSumm dataset, and a 3-point increase in AUC ROC on the SummEval dataset, compared to three baseline models (RefChecker, AlignScore, and Vectara HHEM). We further show reasonable accuracy on detecting 10 novel error subtypes of hallucinations (where even humans struggle in classification) derived from linguistic analysis of the errors made by the LLMs.

2022

pdf bib
An Analysis of Negation in Natural Language Understanding Corpora
Md Mosharaf Hossain | Dhivya Chinnappa | Eduardo Blanco
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

This paper analyzes negation in eight popular corpora spanning six natural language understanding tasks. We show that these corpora have few negations compared to general-purpose English, and that the few negations in them are often unimportant. Indeed, one can often ignore negations and still make the right predictions. Additionally, experimental results show that state-of-the-art transformers trained with these corpora obtain substantially worse results with instances that contain negation, especially if the negations are important. We conclude that new corpora accounting for negation are needed to solve natural language understanding tasks when negation is present.

pdf bib
Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding
Md Mosharaf Hossain | Eduardo Blanco
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Negation poses a challenge in many natural language understanding tasks. Inspired by the fact that understanding a negated statement often requires humans to infer affirmative interpretations, in this paper we show that doing so benefits models for three natural language understanding tasks. We present an automated procedure to collect pairs of sentences with negation and their affirmative interpretations, resulting in over 150,000 pairs. Experimental results show that leveraging these pairs helps (a) T5 generate affirmative interpretations from negations in a previous benchmark, and (b) a RoBERTa-based classifier solve the task of natural language inference. We also leverage our pairs to build a plug-and-play neural generator that given a negated statement generates an affirmative interpretation. Then, we incorporate the pretrained generator into a RoBERTa-based classifier for sentiment analysis and show that doing so improves the results. Crucially, our proposal does not require any manual effort.

pdf bib
A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations
Md Mosharaf Hossain | Luke Holman | Anusha Kakileti | Tiffany Kao | Nathan Brito | Aaron Mathews | Eduardo Blanco
Findings of the Association for Computational Linguistics: NAACL 2022

This paper explores a question-answer driven approach to reveal affirmative interpretations from verbal negations (i.e., when a negation cue grammatically modifies a verb). We create a new corpus consisting of 4,472 verbal negations and discover that 67.1% of them convey that an event actually occurred. Annotators generate and answer 7,277 questions % converted for 4,000 for the 3,001 negations that convey an affirmative interpretation. We first cast the problem of revealing affirmative interpretations from negations as a natural language inference (NLI) classification task. Experimental results show that state-of-the-art transformers trained with existing NLI corpora are insufficient to reveal affirmative interpretations. We also observe, however, that fine-tuning brings substantial improvements. In addition to NLI classification, we also explore the more realistic task of generating affirmative interpretations directly from negations with the T5 transformer. We conclude that the generation task remains a challenge as T5 substantially underperforms humans.

2020

pdf bib
Predicting the Focus of Negation: Model and Error Analysis
Md Mosharaf Hossain | Kathleen Hamilton | Alexis Palmer | Eduardo Blanco
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The focus of a negation is the set of tokens intended to be negated, and a key component for revealing affirmative alternatives to negated utterances. In this paper, we experiment with neural networks to predict the focus of negation. Our main novelty is leveraging a scope detector to introduce the scope of negation as an additional input to the network. Experimental results show that doing so obtains the best results to date. Additionally, we perform a detailed error analysis providing insights into the main error categories, and analyze errors depending on whether the model takes into account scope and context information.

pdf bib
It’s not a Non-Issue: Negation as a Source of Error in Machine Translation
Md Mosharaf Hossain | Antonios Anastasopoulos | Eduardo Blanco | Alexis Palmer
Findings of the Association for Computational Linguistics: EMNLP 2020

As machine translation (MT) systems progress at a rapid pace, questions of their adequacy linger. In this study we focus on negation, a universal, core property of human language that significantly affects the semantics of an utterance. We investigate whether translating negation is an issue for modern MT systems using 17 translation directions as test bed. Through thorough analysis, we find that indeed the presence of negation can significantly impact downstream quality, in some cases resulting in quality reductions of more than 60%. We also provide a linguistically motivated analysis that directly explains the majority of our findings. We release our annotations and code to replicate our analysis here: https://github.com/mosharafhossain/negation-mt.

pdf bib
An Analysis of Natural Language Inference Benchmarks through the Lens of Negation
Md Mosharaf Hossain | Venelin Kovatchev | Pranoy Dutta | Tiffany Kao | Elizabeth Wei | Eduardo Blanco
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Negation is underrepresented in existing natural language inference benchmarks. Additionally, one can often ignore the few negations in existing benchmarks and still make the right inference judgments. In this paper, we present a new benchmark for natural language inference in which negation plays a critical role. We also show that state-of-the-art transformers struggle making inference judgments with the new pairs.