Sriram Venkatapathy
2025
CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization
Brihi Joshi | Sriram Venkatapathy | Mohit Bansal | Nanyun Peng | Haw-Shiuan Chang
Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
Evaluating creative text such as human-written stories with language models has always been a challenging task, owing to the subjectivity of multi-annotator ratings. To mimic the human thinking process, chain of thought (CoT; Wei et al., 2023) generates free-text explanations that help guide a model's predictions, and Self-Consistency (SC; Wang et al., 2022) marginalizes predictions over multiple generated explanations. In this study, we discover that widely-used self-consistency reasoning methods yield suboptimal results due to an objective mismatch between generating 'fluent-looking' explanations and actually leading to a good rating prediction for an aspect of a story. To overcome this challenge, we propose Chain-of-Keywords (CoKe), which generates a sequence of keywords before generating a free-text rationale that guides the rating prediction of our evaluation language model. We then generate a diverse set of such keywords and aggregate the scores corresponding to these generations. On the StoryER dataset, CoKe, based on our small fine-tuned evaluation models, not only reaches human-level performance and significantly outperforms GPT-4 with a 2x boost in correlation with human annotators, but also requires drastically fewer parameters.
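The aggregation step the abstract describes (sample diverse keyword chains, score once per chain, aggregate) can be sketched as follows. This is a minimal toy, not the paper's implementation: `sample_keyword_chain` and `rate_with_chain` are hypothetical stand-ins for the fine-tuned evaluation model, and the keyword pool is invented for illustration.

```python
import random
import statistics

# Hypothetical pool of aspect-relevant keywords; in CoKe these would be
# generated by the evaluation model itself.
KEYWORD_POOL = ["pacing", "suspense", "imagery", "dialogue", "stakes"]

def sample_keyword_chain(rng, length=3):
    """Stand-in for the model's diverse keyword-chain generation."""
    return rng.sample(KEYWORD_POOL, length)

def rate_with_chain(chain, rng):
    """Stand-in for the model's rating conditioned on one keyword chain."""
    return 3 + 0.2 * len(set(chain)) + rng.uniform(-0.5, 0.5)

def coke_style_score(num_chains=8, seed=0):
    """Sample several keyword chains and aggregate their ratings."""
    rng = random.Random(seed)
    ratings = [rate_with_chain(sample_keyword_chain(rng), rng)
               for _ in range(num_chains)]
    return statistics.mean(ratings)  # aggregate over diverse generations
```

The point of the sketch is only the control flow: many diverse generations, one score each, then a marginalized final score, in the spirit of self-consistency.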
2023
Adversarial Robustness for Large Language NER models using Disentanglement and Word Attributions
Xiaomeng Jin | Bhanukiran Vinzamuri | Sriram Venkatapathy | Heng Ji | Pradeep Natarajan
Findings of the Association for Computational Linguistics: EMNLP 2023
Large language models (LLMs) have been widely used for applications such as question answering, text classification, and clustering. While preliminary results across these tasks look promising, recent work has shown that LLMs perform poorly on complex Named Entity Recognition (NER) tasks in comparison to fine-tuned pre-trained language models (PLMs). To encourage wider adoption of LLMs, our paper investigates the robustness of such LLM NER models and their instruction fine-tuned variants to adversarial attacks. In particular, we propose a novel attack that relies on disentanglement and word attribution techniques, where the former aids in learning an embedding that captures entity and non-entity influences separately, and the latter aids in identifying important words across both components. This is in stark contrast to most techniques, which primarily leverage non-entity words for perturbations, limiting the space explored to synthesize effective adversarial examples. Adversarial training based on our method improves the F1 score over the original LLM NER model by 8% and 18% on the CoNLL-2003 and OntoNotes 5.0 datasets, respectively.
2022
Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection
Stefan Schroedl | Manoj Kumar | Kiana Hajebi | Morteza Ziyadi | Sriram Venkatapathy | Anil Ramakrishna | Rahul Gupta | Pradeep Natarajan
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
This paper presents an approach to identify samples from live traffic where the customer implicitly communicated satisfaction with Alexa's responses, by leveraging interpretations of model behavior. Such customer signals are noisy, and adding a large number of samples from live traffic to the training set makes re-training infeasible. Our work addresses these challenges by identifying a small number of samples that grow the training set by ~0.05% while producing statistically significant improvements in both offline and online tests.
Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations
Kuan-Hao Huang | Varun Iyer | Anoop Kumar | Sriram Venkatapathy | Kai-Wei Chang | Aram Galstyan
Findings of the Association for Computational Linguistics: EMNLP 2022
Syntactically controlled paraphrase generation has become an emerging research direction in recent years. Most existing approaches require annotated paraphrase pairs for training and are thus costly to extend to new domains. Unsupervised approaches, on the other hand, do not need paraphrase pairs but suffer from relatively poor performance in terms of syntactic control and quality of generated paraphrases. In this paper, we demonstrate that leveraging Abstract Meaning Representations (AMR) can greatly improve the performance of unsupervised syntactically controlled paraphrase generation. Our proposed model, the AMR-enhanced Paraphrase Generator (AMRPG), separately encodes the AMR graph and the constituency parse of the input sentence into two disentangled semantic and syntactic embeddings. A decoder is then learned to reconstruct the input sentence from the semantic and syntactic embeddings. Our experiments show that AMRPG generates more accurate syntactically controlled paraphrases, both quantitatively and qualitatively, compared to existing unsupervised approaches. We also demonstrate that the paraphrases generated by AMRPG can be used for data augmentation to improve the robustness of NLP models.
2020
Evaluating the Effectiveness of Efficient Neural Architecture Search for Sentence-Pair Tasks
Ansel MacLaughlin | Jwala Dhamala | Anoop Kumar | Sriram Venkatapathy | Ragav Venkatesan | Rahul Gupta
Proceedings of the First Workshop on Insights from Negative Results in NLP
Neural Architecture Search (NAS) methods, which automatically learn entire neural models or individual neural cell architectures, have recently achieved competitive or state-of-the-art (SOTA) performance on a variety of natural language processing and computer vision tasks, including language modeling, natural language inference, and image classification. In this work, we explore the applicability of a SOTA NAS algorithm, Efficient Neural Architecture Search (ENAS; Pham et al., 2018), to two sentence-pair tasks: paraphrase detection and semantic textual similarity. We use ENAS to perform a micro-level search and learn a task-optimized RNN cell architecture as a drop-in replacement for an LSTM. We explore the effectiveness of ENAS through experiments on three datasets (MRPC, SICK, STS-B), with two models (ESIM, BiLSTM-Max) and two sets of embeddings (GloVe, BERT). In contrast to prior work applying ENAS to NLP tasks, our results are mixed: we find that ENAS architectures sometimes, but not always, outperform LSTMs and perform similarly to random architecture search.
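The random-architecture-search baseline the abstract compares against can be sketched as below. This is a toy under stated assumptions: the operation set, cell encoding, and `validation_score` stub are all invented for illustration and stand in for actually training and evaluating each candidate cell.

```python
import random

# Hypothetical per-node activation choices for a candidate RNN cell.
OPS = ["tanh", "relu", "sigmoid", "identity"]

def sample_cell(rng, num_nodes=4):
    """A candidate cell: one activation choice per internal node."""
    return tuple(rng.choice(OPS) for _ in range(num_nodes))

def validation_score(cell, rng):
    """Stand-in for training the cell and measuring dev-set quality."""
    return rng.random()

def random_search(trials=20, seed=0):
    """Sample candidate cells at random and keep the best-scoring one."""
    rng = random.Random(seed)
    return max((sample_cell(rng) for _ in range(trials)),
               key=lambda cell: validation_score(cell, rng))
```

The paper's finding that ENAS "performs similarly to random architecture search" means the learned controller often does no better than this kind of uniform sampling over the same search space.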
2015
Reversibility reconsidered: finite-state factors for efficient probabilistic sampling in parsing and generation
Marc Dymetman | Sriram Venkatapathy | Chunyang Xiao
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2014
Fast Domain Adaptation of SMT models without in-Domain Parallel Data
Prashant Mathur | Sriram Venkatapathy | Nicola Cancedda
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
2013
Confidence-driven Rewriting for Improved Translation
Shachar Mirkin | Sriram Venkatapathy | Marc Dymetman
Proceedings of Machine Translation Summit XIV: Posters
SORT: An Interactive Source-Rewriting Tool for Improved Translation
Shachar Mirkin | Sriram Venkatapathy | Marc Dymetman | Ioan Calapodescu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Investigations in Exact Inference for Hierarchical Translation
Wilker Aziz | Marc Dymetman | Sriram Venkatapathy
Proceedings of the Eighth Workshop on Statistical Machine Translation
2012
An SMT-driven Authoring Tool
Sriram Venkatapathy | Shachar Mirkin
Proceedings of COLING 2012: Demonstration Papers
Prediction of Learning Curves in Machine Translation
Prasanth Kolachina | Nicola Cancedda | Marc Dymetman | Sriram Venkatapathy
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2010
Phrase Based Decoding using a Discriminative Model
Prasanth Kolachina | Sriram Venkatapathy | Srinivas Bangalore | Sudheer Kolachina | Avinesh PVS
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation
A Discriminative Approach for Dependency Based Statistical Machine Translation
Sriram Venkatapathy | Rajeev Sangal | Aravind Joshi | Karthik Gali
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation
2009
Sentence Realisation from Bag of Words with Dependency Constraints
Karthik Gali | Sriram Venkatapathy
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium
2007
Detecting Compositionality of Verb-Object Combinations using Selectional Preferences
Diana McCarthy | Sriram Venkatapathy | Aravind Joshi
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair
Sriram Venkatapathy | Aravind Joshi
Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation
Three models for discriminative machine translation using Global Lexical Selection and Sentence Reconstruction
Sriram Venkatapathy | Srinivas Bangalore
Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation
2006
Using Information about Multi-word Expressions for the Word-Alignment Task
Sriram Venkatapathy | Aravind K. Joshi
Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties
2005
Measuring the Relative Compositionality of Verb-Noun (V-N) Collocations by Integrating Features
Sriram Venkatapathy | Aravind Joshi
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
Co-authors
- Aravind Joshi 6
- Marc Dymetman 5
- Shachar Mirkin 3
- Srinivas Bangalore 2
- Nicola Cancedda 2
- Karthik Gali 2
- Rahul Gupta 2
- Prasanth Kolachina 2
- Anoop Kumar 2
- Pradeep Natarajan 2
- Wilker Aziz 1
- Mohit Bansal 1
- Akshar Bharati 1
- Ioan Calapodescu 1
- Kai-Wei Chang 1
- Haw-Shiuan Chang 1
- Jwala Dhamala 1
- Aram Galstyan 1
- Kiana Hajebi 1
- Kuan-Hao Huang 1
- Varun Iyer 1
- Heng Ji 1
- Xiaomeng Jin 1
- Brihi Joshi 1
- Sudheer Kolachina 1
- Manoj Kumar 1
- Ansel MacLaughlin 1
- Prashanth Mannem 1
- Prashant Mathur 1
- Diana McCarthy 1
- Avinesh PVS 1
- Nanyun Peng 1
- Anil Ramakrishna 1
- Rajeev Sangal 1
- Stefan Schroedl 1
- Ragav Venkatesan 1
- Bhanukiran Vinzamuri 1
- Chunyang Xiao 1
- Morteza Ziyadi 1