Amar Prakash Azad
2026
COMPACT: Building Compliance Paralegals via Clause Graph Reasoning over Contracts
Ayush Singh | Dishank Aggarwal | Pranav Bhagat | Ainulla Khan | Sameer Malik | Amar Prakash Azad
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Contract compliance verification requires reasoning about cross-clause dependencies where obligations, exceptions, and conditions interact across multiple provisions, yet existing legal NLP benchmarks like ContractNLI and CUAD focus exclusively on isolated single-clause tasks. We introduce COMPACT (COMpliance PAralegals via Clause graph reasoning over conTracts), a framework that models cross-clause dependencies through structured clause graphs. Our approach extracts deontic-temporal entities from clauses and constructs typed relationship graphs capturing definitional dependencies, exception hierarchies, and temporal sequences. From these graphs, we introduce ACE (Assessing Compliance in Enterprise), a benchmark containing 4,700 carefully constructed compliance scenarios derived from 633 real-world contracts covering 26 types of agreements. Each scenario requires multi-hop reasoning across multiple clauses and undergoes independent LLM-based validation to ensure quality. Evaluation reveals that multi-clause reasoning poses a fundamental challenge for state-of-the-art models (34–57% base accuracy), while training on ACE yields substantial improvements on compliance tasks (+22–43 percentage points) and also enhances general legal reasoning performance on other benchmarks (PrivaCI-Bench, ContractNLI).
2023
KITLM: Domain-Specific Knowledge InTegration into Language Models for Question Answering
Ankush Agarwal | Sakharam Gawade | Amar Prakash Azad | Pushpak Bhattacharyya
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
Large language models (LLMs) have demonstrated remarkable performance in a wide range of natural language tasks. However, as these models continue to grow in size, they face significant challenges in terms of computational costs. Additionally, LLMs often lack efficient domain-specific understanding, which is particularly crucial in specialized fields such as aviation and healthcare. To boost the domain-specific understanding, we propose KITLM, a novel knowledge base integration approach into language model through relevant information infusion. By integrating pertinent knowledge, not only the performance of the language model is greatly enhanced, but the model size requirement is also significantly reduced while achieving comparable performance. Our proposed knowledge-infused model surpasses the performance of both GPT-3.5-turbo and the state-of-the-art knowledge infusion method, SKILL, achieving over 1.5 times improvement in exact match scores on the MetaQA. KITLM showed a similar performance boost in the aviation domain with AeroQA. The drastic performance improvement of KITLM over the existing methods can be attributed to the infusion of relevant knowledge while mitigating noise. In addition, we release two curated datasets to accelerate knowledge infusion research in specialized fields: a) AeroQA, a new benchmark dataset designed for multi-hop question-answering within the aviation domain, and b) Aviation Corpus, a dataset constructed from unstructured text extracted from the National Transportation Safety Board reports. Our research contributes to advancing the field of domain-specific language understanding and showcases the potential of knowledge infusion techniques in improving the performance.
A Study of Multilingual versus Meta-Learning for Language Model Pre-Training for Adaptation to Unseen Low Resource Languages
Jyotsana Khatri | Rudra Murthy | Amar Prakash Azad | Pushpak Bhattacharyya
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
In this paper, we compare two approaches to training a multilingual language model: (i) simple multilingual learning using data-mixing, and (ii) meta-learning. We examine the performance of these models by extending them to unseen language pairs and further fine-tuning them for the task of unsupervised NMT. We perform several experiments with varying amounts of data and give a comparative analysis of the approaches. We observe that both approaches give comparable performance, and meta-learning gives slightly better results in a few low-data cases. For the Oriya-Punjabi language pair, meta-learning performs better than multilingual learning when using 2M and 3M sentences.
2022
Let the CAT out of the bag: Contrastive Attributed explanations for Text
Saneem Chemmengath | Amar Prakash Azad | Ronny Luss | Amit Dhurandhar
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Contrastive explanations for understanding the behavior of black box models have gained a lot of attention recently as they provide potential for recourse. In this paper, we propose a method, Contrastive Attributed explanations for Text (CAT), which provides contrastive explanations for natural language text data with a novel twist: we build and exploit attribute classifiers, leading to more semantically meaningful explanations. To ensure that our contrastive generated text has the fewest possible edits with respect to the original text, while also being fluent and close to a human generated contrastive, we resort to a minimal perturbation approach regularized using a BERT language model and attribute classifiers trained on available attributes. We show through qualitative examples and a user study that our method not only conveys more insight because of these attributes, but also leads to better quality (contrastive) text. Quantitatively, we show that our method outperforms other state-of-the-art methods across four data sets on four benchmark metrics.