Amit Sheth


pdf bib
FACTIFY-5WQA: 5W Aspect-based Fact Verification through Question Answering
Anku Rani | S.M Towhidul Islam Tonmoy | Dwip Dalal | Shreya Gautam | Megha Chakraborty | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automatic fact verification has received significant attention recently. Contemporary automatic fact-checking systems focus on estimating truthfulness using numerical scores which are not human-interpretable. A human fact-checker generally follows several logical steps to verify a verisimilitude claim and conclude whether it’s truthful or a mere masquerade. Popular fact-checking websites follow a common structure for fact categorization such as half true, half false, false, pants on fire, etc. Therefore, it is necessary to have an aspect-based (delineating which part(s) are true and which are false) explainable system that can assist human fact-checkers in asking relevant questions related to a fact, which can then be validated separately to reach a final verdict. In this paper, we propose a 5W framework (who, what, when, where, and why) for question-answer-based fact explainability. To that end, we present a semi-automatically generated dataset called FACTIFY-5WQA, which consists of 391, 041 facts along with relevant 5W QAs – underscoring our major contribution to this paper. A semantic role labeling system has been utilized to locate 5Ws, which generates QA pairs for claims using a masked language model. Finally, we report a baseline QA system to automatically locate those answers from evidence documents, which can serve as a baseline for future research in the field. Lastly, we propose a robust fact verification system that takes paraphrased claims and automatically validates them. The dataset and the baseline model are available at https: //

pdf bib
ANALOGICAL - A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models
Thilini Wijesiriwardene | Ruwan Wickramarachchi | Bimal Gajera | Shreeyash Gowaikar | Chandan Gupta | Aman Chadha | Aishwarya Naresh Reganti | Amit Sheth | Amitava Das
Findings of the Association for Computational Linguistics: ACL 2023

Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity – (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space. Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.


pdf bib
Proceedings of the Fifth International Workshop on Emoji Understanding and Applications in Social Media
Sanjaya Wijeratne | Jennifer Lee | Horacio Saggion | Amit Sheth
Proceedings of the Fifth International Workshop on Emoji Understanding and Applications in Social Media

pdf bib
Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts
Shrey Gupta | Anmol Agarwal | Manas Gaur | Kaushik Roy | Vignesh Narayanan | Ponnurangam Kumaraguru | Amit Sheth
Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology

Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services (e.g., cognitive behavioral therapy) to patients. However, the utility of CAs to assist in mental health triaging has not been explored in the existing work as it requires a controlled generation of follow-up questions (FQs), which are often initiated and guided by the mental health professionals (MHPs) in clinical settings. In the context of ‘depression’, our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest common subsequence matches to questions in the PHQ-9 dataset respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9 based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user’s initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user’s post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation (with minimal hallucination) suitable for aiding triaging. The dataset created as a part of this research can be obtained from

pdf bib
Evaluating Biomedical Word Embeddings for Vocabulary Alignment at Scale in the UMLS Metathesaurus Using Siamese Networks
Goonmeet Bajaj | Vinh Nguyen | Thilini Wijesiriwardene | Hong Yung Yip | Vishesh Javangula | Amit Sheth | Srinivasan Parthasarathy | Olivier Bodenreider
Proceedings of the Third Workshop on Insights from Negative Results in NLP

Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), for predicting synonymy among biomedical terms to automate a part of the UMLS (Unified Medical Language System) Metathesaurus construction process. We evaluate the use of contextualized word embeddings extracted from nine different biomedical BERT-based models for synonym prediction in the UMLS by replacing BioWordVec embeddings with embeddings extracted from each biomedical BERT model using different feature extraction methods. Finally, we conduct a thorough grid search, which prior work lacks, to find the best set of hyperparameters. Surprisingly, we find that Siamese Networks initialized with BioWordVec embeddings still out perform the Siamese Networks initialized with embedding extracted from biomedical BERT model.


pdf bib
Identifying Depressive Symptoms from Tweets: Figurative Language Enabled Multitask Learning Framework
Shweta Yadav | Jainish Chauhan | Joy Prakash Sain | Krishnaprasad Thirunarayan | Amit Sheth | Jeremiah Schumm
Proceedings of the 28th International Conference on Computational Linguistics

Existing studies on using social media for deriving mental health status of users focus on the depression detection task. However, for case management and referral to psychiatrists, health-care workers require practical and scalable depressive disorder screening and triage system. This study aims to design and evaluate a decision support system (DSS) to reliably determine the depressive triage level by capturing fine-grained depressive symptoms expressed in user tweets through the emulation of the Patient Health Questionnaire-9 (PHQ-9) that is routinely used in clinical practice. The reliable detection of depressive symptoms from tweets is challenging because the 280-character limit on tweets incentivizes the use of creative artifacts in the utterances and figurative usage contributes to effective expression. We propose a novel BERT based robust multi-task learning framework to accurately identify the depressive symptoms using the auxiliary task of figurative usage detection. Specifically, our proposed novel task sharing mechanism,co-task aware attention, enables automatic selection of optimal information across the BERT lay-ers and tasks by soft-sharing of parameters. Our results show that modeling figurative usage can demonstrably improve the model’s robustness and reliability for distinguishing the depression symptoms.

pdf bib
Medical Knowledge-enriched Textual Entailment Framework
Shweta Yadav | Vishal Pallagani | Amit Sheth
Proceedings of the 28th International Conference on Computational Linguistics

One of the cardinal tasks in achieving robust medical question answering systems is textual entailment. The existing approaches make use of an ensemble of pre-trained language models or data augmentation, often to clock higher numbers on the validation metrics. However, two major shortcomings impede higher success in identifying entailment: (1) understanding the focus/intent of the question and (2) ability to utilize the real-world background knowledge to capture the con-text beyond the sentence. In this paper, we present a novel Medical Knowledge-Enriched Textual Entailment framework that allows the model to acquire a semantic and global representation of the input medical text with the help of a relevant domain-specific knowledge graph. We evaluate our framework on the benchmark MEDIQA-RQE dataset and manifest that the use of knowledge-enriched dual-encoding mechanism help in achieving an absolute improvement of 8.27% over SOTA language models.


pdf bib
A Practical Incremental Learning Framework For Sparse Entity Extraction
Hussein Al-Olimat | Steven Gustafson | Jason Mackay | Krishnaprasad Thirunarayan | Amit Sheth
Proceedings of the 27th International Conference on Computational Linguistics

This work addresses challenges arising from extracting entities from textual data, including the high cost of data annotation, model accuracy, selecting appropriate evaluation criteria, and the overall quality of annotation. We present a framework that integrates Entity Set Expansion (ESE) and Active Learning (AL) to reduce the annotation cost of sparse data and provide an online evaluation method as feedback. This incremental and interactive learning framework allows for rapid annotation and subsequent extraction of sparse data while maintaining high accuracy. We evaluate our framework on three publicly available datasets and show that it drastically reduces the cost of sparse entity annotation by an average of 85% and 45% to reach 0.9 and 1.0 F-Scores respectively. Moreover, the method exhibited robust performance across all datasets.

pdf bib
Location Name Extraction from Targeted Text Streams using Gazetteer-based Statistical Language Models
Hussein Al-Olimat | Krishnaprasad Thirunarayan | Valerie Shalin | Amit Sheth
Proceedings of the 27th International Conference on Computational Linguistics

Extracting location names from informal and unstructured social media data requires the identification of referent boundaries and partitioning compound names. Variability, particularly systematic variability in location names (Carroll, 1983), challenges the identification task. Some of this variability can be anticipated as operations within a statistical language model, in this case drawn from gazetteers such as OpenStreetMap (OSM), Geonames, and DBpedia. This permits evaluation of an observed n-gram in Twitter targeted text as a legitimate location name variant from the same location-context. Using n-gram statistics and location-related dictionaries, our Location Name Extraction tool (LNEx) handles abbreviations and automatically filters and augments the location names in gazetteers (handling name contractions and auxiliary contents) to help detect the boundaries of multi-word location names and thereby delimit them in texts. We evaluated our approach on 4,500 event-specific tweets from three targeted streams to compare the performance of LNEx against that of ten state-of-the-art taggers that rely on standard semantic, syntactic and/or orthographic features. LNEx improved the average F-Score by 33-179%, outperforming all taggers. Further, LNEx is capable of stream processing.

pdf bib
Multi-Task Learning Framework for Mining Crowd Intelligence towards Clinical Treatment
Shweta Yadav | Asif Ekbal | Sriparna Saha | Pushpak Bhattacharyya | Amit Sheth
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

In recent past, social media has emerged as an active platform in the context of healthcare and medicine. In this paper, we present a study where medical user’s opinions on health-related issues are analyzed to capture the medical sentiment at a blog level. The medical sentiments can be studied in various facets such as medical condition, treatment, and medication that characterize the overall health status of the user. Considering these facets, we treat analysis of this information as a multi-task classification problem. In this paper, we adopt a novel adversarial learning approach for our multi-task learning framework to learn the sentiment’s strengths expressed in a medical blog. Our evaluation shows promising results for our target tasks.


pdf bib
Clustering for Simultaneous Extraction of Aspects and Features from Reviews
Lu Chen | Justin Martineau | Doreen Cheng | Amit Sheth
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies


pdf bib
Implicit Entity Recognition in Clinical Documents
Sujan Perera | Pablo Mendes | Amit Sheth | Krishnaprasad Thirunarayan | Adarsh Alex | Christopher Heid | Greg Mott
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics


pdf bib
Active Learning with Efficient Feature Weighting Methods for Improving Data Quality and Classification Accuracy
Justin Martineau | Lu Chen | Doreen Cheng | Amit Sheth
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)