Waisullah Yousofi
2024
Reconsidering SMT Over NMT for Closely Related Languages: A Case Study of Persian-Hindi Pair
Waisullah Yousofi
|
Pushpak Bhattacharyya
Proceedings of the 21st International Conference on Natural Language Processing (ICON)
This paper demonstrates that Phrase-Based Statistical Machine Translation (PBSMT) can outperform Transformer-based Neural Machine Translation (NMT) in moderate-resource scenarios, specifically for structurally similar languages, Persian-Hindi pair in our case. Despite the Transformer architecture’s typical preference for large parallel corpora, our results show that PBSMT achieves a BLEU score of 66.32, significantly exceeding the Transformer-NMT score of 53.7 ingesting the same dataset.