Linsheng Guo
2023
Pipeline Enabling Zero-shot Classification for Bangla Handwritten Grapheme
Linsheng Guo
|
Md Habibur Sifat
|
Tashin Ahmed
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
This research investigates Zero-Shot Learning (ZSL), and proposes CycleGAN-based image synthesis and accurate label mapping to build a strong association between labels and graphemes. The objective is to enhance model accuracy in detecting unseen classes by employing advanced font image categorization and a CycleGAN-based generator. The resulting representations of abstract character structures demonstrate a significant improvement in recognition, accommodating both seen and unseen classes. This investigation addresses the complex issue of Optical Character Recognition (OCR) in the specific context of the Bangla language. Bangla script is renowned for its intricate nature, consisting of a total of 49 letters, which include 11 vowels, 38 consonants, and 18 diacritics. The combination of letters in this complex arrangement provides the opportunity to create almost 13,000 unique variations of graphemes, which exceeds the number of graphemic units found in the English language. Our investigation presents a new strategy for ZSL in the context of Bangla OCR. This approach combines generative models with careful labeling techniques to enhance the progress of Bangla OCR, specifically focusing on grapheme categorization. Our goal is to make a substantial impact on the digitalization of educational resources in the Indian subcontinent.
Search