Keerthiram Murugesan


2024

pdf bib
EXPLORER: Exploration-guided Reasoning for Textual Reinforcement Learning
Kinjal Basu | Keerthiram Murugesan | Subhajit Chaudhury | Murray Campbell | Kartik Talamadupula | Tim Klinger
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Text-based games (TBGs) have emerged as an important collection of NLP tasks, requiring reinforcement learning (RL) agents to combine natural language understanding with reasoning. A key challenge for agents attempting to solve such tasks is to generalize across multiple games and demonstrate good performance on both seen and unseen objects. Purely deep-RL-based approaches may perform well on seen objects; however, they fail to showcase the same performance on unseen objects. Commonsense-infused deep-RL agents may work better on unseen data; unfortunately, their policies are often not interpretable or easily transferable. To tackle these issues, in this paper, we present EXPLORER which is an exploration-guided reasoning agent for textual reinforcement learning. EXPLORER is neuro-symbolic in nature, as it relies on a neural module for exploration and a symbolic module for exploitation. It can also learn generalized symbolic policies and perform well over unseen data. Our experiments show that EXPLORER outperforms the baseline agents on Text-World cooking (TW-Cooking) and Text-World Commonsense (TWC) games.

2023

pdf bib
Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning
Subhajit Chaudhury | Sarathkrishna Swaminathan | Daiki Kimura | Prithviraj Sen | Keerthiram Murugesan | Rosario Uceda-Sosa | Michiaki Tatsubori | Achille Fokoue | Pavan Kapanipathi | Asim Munawar | Alexander Gray
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Text-based reinforcement learning agents have predominantly been neural network-based models with embeddings-based representation, learning uninterpretable policies that often do not generalize well to unseen games. On the other hand, neuro-symbolic methods, specifically those that leverage an intermediate formal representation, are gaining significant attention in language understanding tasks. This is because of their advantages ranging from inherent interpretability, the lesser requirement of training data, and being generalizable in scenarios with unseen data. Therefore, in this paper, we propose a modular, NEuro-Symbolic Textual Agent (NESTA) that combines a generic semantic parser with a rule induction system to learn abstract interpretable rules as policies. Our experiments on established text-based game benchmarks show that the proposed NESTA method outperforms deep reinforcement learning-based techniques by achieving better generalization to unseen test games and learning from fewer training interactions.

pdf bib
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types
Keerthiram Murugesan | Sarathkrishna Swaminathan | Soham Dan | Subhajit Chaudhury | Chulaka Gunasekara | Maxwell Crouse | Diwakar Mahajan | Ibrahim Abdelaziz | Achille Fokoue | Pavan Kapanipathi | Salim Roukos | Alexander Gray
Findings of the Association for Computational Linguistics: ACL 2023

With the growing interest in large language models, the need for evaluating the quality of machine text compared to reference (typically human-generated) text has become focal attention. Most recent works focus either on task-specific evaluation metrics or study the properties of machine-generated text captured by the existing metrics. In this work, we propose a new evaluation scheme to model human judgments in 7 NLP tasks, based on the fine-grained mismatches between a pair of texts. Inspired by the recent efforts in several NLP tasks for fine-grained evaluation, we introduce a set of 13 mismatch error types such as spatial/geographic errors, entity errors, etc, to guide the model for better prediction of human judgments. We propose a neural framework for evaluating machine texts that uses these mismatch error types as auxiliary tasks and re-purposes the existing single-number evaluation metrics as additional scalar features, in addition to textual features extracted from the machine and reference texts. Our experiments reveal key insights about the existing metrics via the mismatch errors. We show that the mismatch errors between the sentence pairs on the held-out datasets from 7 NLP tasks align well with the human evaluation.

2022

pdf bib
X-FACTOR: A Cross-metric Evaluation of Factual Correctness in Abstractive Summarization
Subhajit Chaudhury | Sarathkrishna Swaminathan | Chulaka Gunasekara | Maxwell Crouse | Srinivas Ravishankar | Daiki Kimura | Keerthiram Murugesan | Ramón Fernandez Astudillo | Tahira Naseem | Pavan Kapanipathi | Alexander Gray
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Abstractive summarization models often produce factually inconsistent summaries that are not supported by the original article. Recently, a number of fact-consistent evaluation techniques have been proposed to address this issue; however, a detailed analysis of how these metrics agree with one another has yet to be conducted. In this paper, we present X-FACTOR, a cross-evaluation of three high-performing fact-aware abstractive summarization methods. First, we show that summarization models are often fine-tuned on datasets that contain factually inconsistent summaries and propose a fact-aware filtering mechanism that improves the quality of training data and, consequently, the factuality of these models. Second, we propose a corrector module that can be used to improve the factual consistency of generated summaries. Third, we present a re-ranking technique that samples summary instances from the output distribution of a summarization model and re-ranks the sampled instances based on their factuality. Finally, we provide a detailed cross-metric agreement analysis that shows how tuning a model to output summaries based on a particular factuality metric influences factuality as determined by the other metrics. Our goal in this work is to facilitate research that improves the factuality and faithfulness of abstractive summarization models.

2021

pdf bib
Efficient Text-based Reinforcement Learning by Jointly Leveraging State and Commonsense Graph Representations
Keerthiram Murugesan | Mattia Atzeni | Pavan Kapanipathi | Kartik Talamadupula | Mrinmaya Sachan | Murray Campbell
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Text-based games (TBGs) have emerged as useful benchmarks for evaluating progress at the intersection of grounded language understanding and reinforcement learning (RL). Recent work has proposed the use of external knowledge to improve the efficiency of RL agents for TBGs. In this paper, we posit that to act efficiently in TBGs, an agent must be able to track the state of the game while retrieving and using relevant commonsense knowledge. Thus, we propose an agent for TBGs that induces a graph representation of the game state and jointly grounds it with a graph of commonsense knowledge from ConceptNet. This combination is achieved through bidirectional knowledge graph attention between the two symbolic representations. We show that agents that incorporate commonsense into the game state graph outperform baseline agents.