Sabbir Ahmed

2025

bnContextQA: Benchmarking Long-Context Question Answering and Challenges in Bangla
Adnan Ahmad | Labiba Adiba | Namirah Rasul | Md Tahmid Rahman Laskar | Sabbir Ahmed
Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)

Large language models (LLMs) have advanced in processing long input sequences, but their ability to consistently use information across extended contexts remains a challenge. Recent studies highlight a positional bias in which models prioritize information at the beginning or end of the input while neglecting the middle, resulting in a U-shaped performance curve, but these findings were limited to English. Whether this bias is universal or shaped by language-specific factors remains unclear. In this work, we investigate positional bias in Bangla, a widely spoken but computationally underrepresented language. To support this, we introduce a novel Bangla benchmark dataset, bnContextQA, specifically designed for long-context comprehension. The dataset comprises 350 long-context QA instances, each paired with 30 context paragraphs, allowing controlled evaluation of information retrieval at different positions. Using this dataset, we assess the performance of LLMs on Bangla across varying passage positions, providing insights into cross-linguistic positional effects. The bnContextQA dataset is publicly available at https://github.com/labiba02/bnContextQA.git to support future research on long-context understanding in Bangla and multilingual LLMs.

Form-aware Poetic Generation for Bangla
Amina | Abdullah | Mueeze Al Mushabbir | Sabbir Ahmed
Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)

Poetry generation in low-resource languages such as Bangla is particularly challenging due to the scarcity of structured poetic corpora and the complexity of its metrical system (matra). We present a structure-aware framework for Bangla poetry generation using pretrained Bangla large language models (LLMs), namely TigerLLM, TituLLM, and BanglaT5, trained on general non-poetic text corpora augmented with rich structural control tokens. These tokens capture rhyme, meter, word count, and line boundaries, enabling unsupervised modeling of poetic form without curated poetry datasets. Unlike prior fixed-pattern approaches, our framework introduces variable control compositions, allowing models to generate flexible poetic structures. Experiments show that explicit structural conditioning improves rhyme consistency and metrical balance while maintaining semantic coherence. Our study provides the first systematic evaluation of Bangla LLMs for form-constrained creative generation, offering insights into structural representation in low-resource poetic modeling.

2023

Unveiling the Essence of Poetry: Introducing a Comprehensive Dataset and Benchmark for Poem Summarization
Ridwan Mahbub | Ifrad Khan | Samiha Anuva | Md Shihab Shahriar | Md Tahmid Rahman Laskar | Sabbir Ahmed
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

While research in natural language processing has progressed significantly in creative language generation, the question of whether language models can interpret the intended meaning of creative language largely remains unanswered. Poetry as a creative art form has existed for generations, and summarization of such content requires deciphering the figurative patterns to find out the actual intent and message of the poet. This task can provide researchers with an opportunity to evaluate the creative language interpretation capacity of language models. Unlike summarization of typical text, summarization of poems is challenging because poems carry a deeper meaning, which can be easily lost if only the literal meaning is considered. We therefore propose a new task in the field of natural language understanding called ‘Poem Summarization’. As a starting point, we propose the first-ever dataset for this task, named ‘PoemSum’, consisting of 3011 samples of poetry and their corresponding summarized interpretations in the English language. We have benchmarked the performance of different state-of-the-art summarization models and provided observations on their limitations. The dataset and all relevant code used in this work have been made publicly available.

The ability to identify important entities in a text, known as Named Entity Recognition (NER), is useful in a large variety of downstream tasks in the biomedical domain. This is a considerably difficult task when working with Consumer Health Questions (CHQs), which consist of informal language used in day-to-day life by patients. These difficulties are amplified in the case of Bengali, which allows for a huge amount of flexibility in sentence structures and has significant variation across regional dialects. Unfortunately, the complexity of the language is not accurately reflected in the limited amount of available data, which makes it difficult to build a reliable decision-making system. To address the scarcity of data, this paper presents ‘Bangla-HealthNER’, a comprehensive dataset designed to identify named entities in health-related texts in the Bengali language. It consists of 31,783 samples sourced from a popular online public health platform, which allows it to capture the diverse range of linguistic styles and dialects used by native speakers from various regions in their day-to-day lives. The insight into this diversity in language will prove useful to any medical decision-making system developed for use in real-world applications. To highlight the difficulty of the dataset, it has been benchmarked on state-of-the-art token classification models, where BanglishBERT achieved the highest performance with an F1-score of 56.13 ± 0.75%. The dataset and all relevant code used in this work have been made publicly available.

BanglaCHQ-Summ: An Abstractive Summarization Dataset for Medical Queries in Bangla Conversational Speech
Alvi Khan | Fida Kamal | Mohammad Abrar Chowdhury | Tasnim Ahmed | Md Tahmid Rahman Laskar | Sabbir Ahmed
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)

Online health consultation is steadily gaining popularity as a platform for patients to discuss their medical health inquiries, known as Consumer Health Questions (CHQs). The emergence of the COVID-19 pandemic has also led to a surge in the use of such platforms, creating a significant burden for the limited number of healthcare professionals attempting to respond to the influx of questions. Abstractive text summarization is a promising solution to this challenge, since shortening CHQs to only the information essential to answering them reduces the amount of time spent parsing unnecessary information. The summarization process can also serve as an intermediate step towards the eventual development of an automated medical question-answering system. This paper presents ‘BanglaCHQ-Summ’, the first CHQ summarization dataset for the Bangla language, consisting of 2,350 question-summary pairs. It is benchmarked on state-of-the-art Bangla and multilingual text generation models, with the best-performing model, BanglaT5, achieving a ROUGE-L score of 48.35%. In addition, we address the limitations of existing automatic metrics for summarization by conducting a human evaluation. The dataset and all relevant code used in this work have been made publicly available.