Khin Mar Soe


A Myanmar (Burmese)-English Named Entity Transliteration Dictionary
Aye Myat Mon | Chenchen Ding | Hour Kaing | Khin Mar Soe | Masao Utiyama | Eiichiro Sumita
Proceedings of the 12th Language Resources and Evaluation Conference

Transliteration is generally a phonetically based transcription across different writing systems. It is a crucial task for various downstream natural language processing applications. For the Myanmar (Burmese) language, robust automatic transliteration for borrowed English words is a challenging task because of the complex Myanmar writing system and the lack of data. In this study, we constructed a Myanmar-English named entity dictionary containing more than eighty thousand transliteration instances. The data have been released under a CC BY-NC-SA license. We evaluated the automatic transliteration performance using statistical and neural network-based approaches based on the prepared data. The neural network model outperformed the statistical model significantly in terms of the BLEU score on the character level. Different units used in the Myanmar script for processing were also compared and discussed.


Statistical Machine Translation between Myanmar (Burmese) and Dawei (Tavoyan)
Thazin Myint Oo | Ye Kyaw Thu | Khin Mar Soe | Thepchai Supnithi
Proceedings of The First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019) co-located with ICNLSP 2019 - Short Papers

Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese)
Thazin Myint Oo | Ye Kyaw Thu | Khin Mar Soe
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects

This work explores neural machine translation between Myanmar (Burmese) and Rakhine (Arakanese). Rakhine is a language closely related to Myanmar, often considered a dialect. We implemented three prominent neural machine translation (NMT) systems: recurrent neural networks (RNN), transformer, and convolutional neural networks (CNN). The systems were evaluated on a Myanmar-Rakhine parallel text corpus developed by us. In addition, two types of word segmentation schemes for word embeddings were studied: Word-BPE and Syllable-BPE segmentation. Our experimental results clearly show that the highest quality NMT and statistical machine translation (SMT) performances are obtained with Syllable-BPE segmentation for both types of translations. If we focus on NMT, we find that the transformer with Word-BPE segmentation outperforms CNN and RNN for both Myanmar-Rakhine and Rakhine-Myanmar translation. However, CNN with Syllable-BPE segmentation obtains a higher score than the RNN and transformer.


Myanmar Phrases Translation Model with Morphological Analysis for Statistical Myanmar to English Translation System
Thet Thet Zin | Khin Mar Soe | Ni Lar Thein
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

Developing a Chunk-based Grammar Checker for Translated English Sentences
Nay Yee Lin | Khin Mar Soe | Ni Lar Thein
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation