Benyamin Ahmadnia

2025

pdf bib abs
Multilingual Pre-training Meets Supervised Neural Machine Translation: A Reproducible Evaluation on English–French and Finnish Translation
Benyamin Ahmadnia | Yeswanth Soma | Hossein Sarrafzadeh
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

This paper presents a comparative evaluation of Transformer-based Neural Machine Translation (NMT) models and pre-trained multilingual sequence-to-sequence models in the context of moderately-resourced MT. Using English-French (high-resource) and English-Finnish (moderate-resource) as case studies, we assess the effectiveness of fine-tuning the mBART model versus training standard NMT systems from scratch. Our experiments incorporate data-augmentation techniques such as back-translation and evaluate translation quality using BLEU, TER, METEOR, and COMET metrics. We also provide a detailed error analysis that covers lexical choice, named entity handling, and word order. While mBART demonstrates consistent improvements over classical NMT, particularly in handling complex linguistic structures and sparse training data, we acknowledge the challenges of deploying large models in resource-constrained settings. Our findings highlight practical trade-offs between model complexity, resource availability, and translation quality in multilingual scenarios.

pdf bib abs
Advancing Clinical Translation in Nepali through Fine-Tuned Multilingual Models
Benyamin Ahmadnia | Sumaiya Shaikh | Bibek Poudel | Shazan Mohammed | Sahar Hooshmand
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Low-resource Neural Machine Translation (NMT) remains a major challenge, particularly in high-stakes domains such as healthcare. This paper presents a domain-adapted pipeline for English-Nepali medical translation leveraging two state-of-the-art multilingual Large Language Models (LLMs): mBART and NLLB-200. A high-quality, domain-specific parallel corpus is curated, and both models are fine-tuned using PyTorch frameworks. Translation fidelity is assessed through a multi-metric evaluation strategy that combines BLEU, CHRF++, METEOR, BERTScore, COMET, and perplexity. Our experimental results show that NLLB-200 consistently outperforms mBART across surface-level and semantic metrics, achieving higher accuracy and lower hallucination rates in clinical settings. In addition, error profiling and ethical assessments are conducted to highlight challenges such as term omissions and cultural bias. This work underscores the viability of large-scale multilingual models in enhancing medical translation for low-resource languages and proposes actionable paths toward safer and more equitable MT deployment in healthcare.

pdf bib abs
Multi-Agent Reinforcement Learning for Interactive Code Debugging with Human Feedback and Memory
Anjana Krishnamoorthy | Kartik Ivatury | Benyamin Ahmadnia
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

This paper introduces an interactive Python debugging framework that combines multi-agent reinforcement learning, Natural Language Processing (NLP), and long-term memory. Two Proximal Policy Optimization (PPO) agents specialize in syntax and logic errors, generating candidate fixes that developers can accept, reject, or refine. A BERT-based module encodes natural language feedback into dense embeddings and quality scores, which shape reward signals for Reinforcement Learning from Human Feedback (RLHF). To support personalization, the system uses dual FAISS indices to retrieve past fixes based on code-error pairs and developer explanations. Evaluated on a synthetic dataset of 200 Python programs, our approach achieves an 88% syntax-fix rate and 45% logic-fix rate within five suggestions—outperforming one-shot Large Language Model (LLM) baselines. In addition, the system improves the quality of the explanation, as measured by BLEU, ROUGE, and CodeBLEU. By integrating multi-agent specialization, linguistic feedback, and memory-driven retrieval, our framework delivers a more efficient, adaptive, and developer-aligned debugging experience.

2023

pdf bib abs
Multimodal Learning for Accurate Visual Question Answering: An Attention-Based Approach
Jishnu Bhardwaj | Anurag Balakrishnan | Satyam Pathak | Ishan Unnarkar | Aniruddha Gawande | Benyamin Ahmadnia
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

This paper proposes an open-ended task for Visual Question Answering (VQA) that leverages the InceptionV3 Object Detection model and an attention-based Long Short-Term Memory (LSTM) network for question answering. Our proposed model provides accurate natural language answers to questions about an image, including those that require understanding contextual information and background details. Our findings demonstrate that the proposed approach can achieve high accuracy, even with complex and varied visual information. The proposed method can contribute to developing more advanced vision systems that can process and interpret visual information like humans.

pdf bib abs
Uncertainty Quantification of Text Classification in a Multi-Label Setting for Risk-Sensitive Systems
Jinha Hwang | Carol Gudumotu | Benyamin Ahmadnia
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

This paper addresses the challenge of uncertainty quantification in text classification for medical purposes and provides a three-fold approach to support robust and trustworthy decision-making by medical practitioners. Also, we address the challenge of imbalanced datasets in the medical domain by utilizing the Mondrian Conformal Predictor with a Naïve Bayes classifier.

pdf bib abs
Sign Language Recognition and Translation: A Multi-Modal Approach Using Computer Vision and Natural Language Processing
Jacky Li | Jaren Gerdes | James Gojit | Austin Tao | Samyak Katke | Kate Nguyen | Benyamin Ahmadnia
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Sign-to-Text (S2T) is a hand gesture recognition program in the American Sign Language (ASL) domain. The primary objective of S2T is to classify standard ASL alphabets and custom signs and convert the classifications into a stream of text using neural networks. This paper addresses the shortcomings of pure Computer Vision techniques and applies Natural Language Processing (NLP) as an additional layer of complexity to increase S2T’s robustness.

pdf bib abs
Systematic TextRank Optimization in Extractive Summarization
Morris Zieve | Anthony Gregor | Frederik Juul Stokbaek | Hunter Lewis | Ellis Marie Mendoza | Benyamin Ahmadnia
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

With the ever-growing amount of textual data, extractive summarization has become increasingly crucial for efficiently processing information. The TextRank algorithm, a popular unsupervised method, offers excellent potential for this task. In this paper, we aim to optimize the performance of TextRank by systematically exploring and verifying the best preprocessing and fine-tuning techniques. We extensively evaluate text preprocessing methods, such as tokenization, stemming, and stopword removal, to identify the most effective combination with TextRank. Additionally, we examine fine-tuning strategies, including parameter optimization and incorporation of domain-specific knowledge, to achieve superior summarization quality.

2020

pdf bib abs
An Effective Optimization Method for Neural Machine Translation: The Case of English-Persian Bilingually Low-Resource Scenario
Benyamin Ahmadnia | Raul Aranovich
Proceedings of the 7th Workshop on Asian Translation

In this paper, we propose a useful optimization method for low-resource Neural Machine Translation (NMT) by investigating the effectiveness of multiple neural network optimization algorithms. Our results confirm that applying the proposed optimization method on English-Persian translation can exceed translation quality compared to the English-Persian Statistical Machine Translation (SMT) paradigm.

2019

pdf bib abs
Bilingual Low-Resource Neural Machine Translation with Round-Tripping: The Case of Persian-Spanish
Benyamin Ahmadnia | Bonnie Dorr
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

The quality of Neural Machine Translation (NMT), as a data-driven approach, massively depends on quantity, quality, and relevance of the training dataset. Such approaches have achieved promising results for bilingually high-resource scenarios but are inadequate for low-resource conditions. This paper describes a round-trip training approach to bilingual low-resource NMT that takes advantage of monolingual datasets to address training data scarcity, thus augmenting translation quality. We conduct detailed experiments on Persian-Spanish as a bilingually low-resource scenario. Experimental results demonstrate that this competitive approach outperforms the baselines.

pdf bib abs
Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network
Benyamin Ahmadnia | Bonnie Dorr
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Phrases play a key role in Machine Translation (MT). In this paper, we apply a Long Short-Term Memory (LSTM) model over conventional Phrase-Based Statistical MT (PBSMT). The core idea is to use an LSTM encoder-decoder to score the phrase table generated by the PBSMT decoder. Given a source sequence, the encoder and decoder are jointly trained in order to maximize the conditional probability of a target sequence. Analytically, the performance of a PBSMT system is enhanced by using the conditional probabilities of phrase pairs computed by an LSTM encoder-decoder as an additional feature in the existing log-linear model. We compare the performance of the phrase tables in the PBSMT to the performance of the proposed LSTM and observe its positive impact on translation quality. We construct a PBSMT model using the Moses decoder and enrich the Language Model (LM) utilizing an external dataset. We then rank the phrase tables using an LSTM-based encoder-decoder. This method produces a gain of up to 3.14 BLEU score on the test set.

2017

pdf bib abs
Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language
Benyamin Ahmadnia | Javier Serrano | Gholamreza Haffari
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

This paper is an attempt to exclusively focus on investigating the pivot language technique in which a bridging language is utilized to increase the quality of the Persian-Spanish low-resource Statistical Machine Translation (SMT). In this case, English is used as the bridging language, and the Persian-English SMT is combined with the English-Spanish one, where the relatively large corpora of each may be used in support of the Persian-Spanish pairing. Our results indicate that the pivot language technique outperforms the direct SMT processes currently in use between Persian and Spanish. Furthermore, we investigate the sentence translation pivot strategy and the phrase translation in turn, and demonstrate that, in the context of the Persian-Spanish SMT system, the phrase-level pivoting outperforms the sentence-level pivoting. Finally we suggest a method called combination model in which the standard direct model and the best triangulation pivoting model are blended in order to reach a high-quality translation.

Co-authors

Venues

Fix author