NECTEC’s Participation in WAT-2021
Zar Zar Hlaing | Ye Kyaw Thu | Thazin Myint Oo | Mya Ei San | Sasiporn Usanavasin | Ponrudee Netisopakul | Thepchai Supnithi
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
In this paper, we report the experimental results of the machine translation models developed by the NECTEC team for the translation tasks of WAT-2021. Our models are based on neural methods for both translation directions, English-Myanmar and Myanmar-English. Most existing neural machine translation (NMT) models focus on the conversion of sequential data and do not directly use syntactic information. In contrast, we build multi-source NMT models using multilingual corpora such as a string data corpus, a tree data corpus, or a POS-tagged data corpus. Multi-source translation is an approach that exploits multiple inputs (e.g., in two different formats) to increase translation accuracy. Our experiments use an RNN-based encoder-decoder model with an attention mechanism as well as transformer architectures. The experimental results show that the proposed RNN-based models outperform the baseline model on the English-to-Myanmar translation task, and that the multi-source and shared-multi-source transformer models yield better translation results than the baseline.
Non-Dictionary-Based Thai Word Segmentation Using Decision Trees
Thanaruk Theeramunkong | Sasiporn Usanavasin
Proceedings of the First International Conference on Human Language Technology Research