Vaibhav Kanojia


2025

We present a direction-specialized neural machine translation framework for ultra-low-resource Indic and tribal languages, including Bhili, Gondi, Mundari, and Santali. Using the NLLB-600M backbone, we freeze the multilingual encoder and fine-tune direction-specific decoders to reduce negative transfer and improve morphological fidelity under severe data scarcity. Our system is trained with leakage-safe splits, bitext reversal augmentation, and memory-efficient mixed-precision optimization. On the official MMLoSo 2025 Kaggle benchmark, we achieve a public score of 171.4 and a private score of 161.1, demonstrating stable generalization in highly noisy low-resource conditions.
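The two data-side steps named above, leakage-safe splits and bitext reversal augmentation, can be illustrated with a minimal sketch. It is an assumption about how such steps might look, not the authors' pipeline: the function names, the 10% validation share, the choice to key the split on the normalized source sentence, and the language labels "pivot" and "bhili" are all hypothetical. Reversal is applied only after splitting, so reversed copies of validation pairs never enter training.

```python
import hashlib
import unicodedata


def normalize(text):
    """Canonicalize a sentence so trivial variants (case, spacing) compare equal."""
    return " ".join(unicodedata.normalize("NFKC", text).lower().split())


def bucket(text, num_buckets=100):
    """Map a sentence to a stable bucket in [0, num_buckets) via SHA-256."""
    digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def leakage_safe_split(pairs, valid_percent=10):
    """Split (src, tgt) pairs so that a given normalized source sentence
    always lands in exactly one of train or validation.

    Assumption: leakage is defined on the source side only.
    """
    train, valid = [], []
    for src, tgt in pairs:
        target_list = valid if bucket(src) < valid_percent else train
        target_list.append((src, tgt))
    return train, valid


def add_reversed(train_pairs, src_lang, tgt_lang):
    """Emit each training pair in both directions, tagged with its direction."""
    examples = []
    for src, tgt in train_pairs:
        examples.append(
            {"src_lang": src_lang, "tgt_lang": tgt_lang, "src": src, "tgt": tgt}
        )
        examples.append(
            {"src_lang": tgt_lang, "tgt_lang": src_lang, "src": tgt, "tgt": src}
        )
    return examples


if __name__ == "__main__":
    # Hypothetical toy bitext; real data would be loaded from the benchmark files.
    toy = [(f"sentence {i}", f"translation {i}") for i in range(50)]
    train, valid = leakage_safe_split(toy)
    augmented = add_reversed(train, "pivot", "bhili")
    print(len(train), len(valid), len(augmented))
```

Hashing the normalized sentence rather than shuffling by index keeps the assignment stable across runs and sends near-duplicate sources to the same side of the split. The direction tags on each example are what a direction-specific decoder would select on during fine-tuning.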