Hybrid Statistical Machine Translation for English-Myanmar: UTYCC Submission to WAT-2021
Ye Kyaw Thu, Thazin Myint Oo, Hlaing Myat Nwe, Khaing Zar Mon, Nang Aeindray Kyaw, Naing Linn Phyo, Nann Hwan Khun, Hnin Aye Thant
Abstract
In this paper, we describe our submissions to WAT-2021 (Nakazawa et al., 2021) for the English-to-Myanmar (Burmese) translation task. Our team, ID “YCC-MT1”, focused on bringing transliteration knowledge to the decoder without changing the model. We manually extracted transliteration word/phrase pairs from the ALT corpus and applied the XML markup feature of the Moses decoder (i.e., -xml-input exclusive and -xml-input inclusive). We demonstrate that this hybrid translation technique can significantly improve (by around 6 BLEU points) on the baselines of three well-known approaches: “Phrase-based SMT”, “Operation Sequence Model”, and “Hierarchical Phrase-based SMT”. Moreover, this simple hybrid method achieved the second-highest results among the submitted MT systems for the English-to-Myanmar WAT-2021 translation shared task according to BLEU (Papineni et al., 2002) and AMFM scores (Banchs et al., 2015).
- Anthology ID: 2021.wat-1.7
- Volume: Proceedings of the 8th Workshop on Asian Translation (WAT2021)
- Month: August
- Year: 2021
- Address: Online
- Editors: Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Shohei Higashiyama, Hiroshi Manabe, Win Pa Pa, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
- Venue: WAT
- Publisher: Association for Computational Linguistics
- Pages: 83–89
- URL: https://aclanthology.org/2021.wat-1.7
- DOI: 10.18653/v1/2021.wat-1.7
- Cite (ACL): Ye Kyaw Thu, Thazin Myint Oo, Hlaing Myat Nwe, Khaing Zar Mon, Nang Aeindray Kyaw, Naing Linn Phyo, Nann Hwan Khun, and Hnin Aye Thant. 2021. Hybrid Statistical Machine Translation for English-Myanmar: UTYCC Submission to WAT-2021. In Proceedings of the 8th Workshop on Asian Translation (WAT2021), pages 83–89, Online. Association for Computational Linguistics.
- Cite (Informal): Hybrid Statistical Machine Translation for English-Myanmar: UTYCC Submission to WAT-2021 (Thu et al., WAT 2021)
- PDF: https://aclanthology.org/2021.wat-1.7.pdf
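The abstract mentions passing transliteration pairs to the Moses decoder through its XML markup feature. The following is only a minimal sketch of how such markup could be produced, not the authors' actual pipeline: the function name `markup_sentence`, the tag name `np`, the greedy longest-match strategy, and the example pairs are all assumptions made for illustration, and the pairs are not taken from the ALT corpus.

```python
import html


def markup_sentence(tokens, translit_pairs, tag="np"):
    """Wrap known source phrases in Moses-style XML markup.

    tokens: list of source-side tokens (already tokenized).
    translit_pairs: dict mapping a tuple of source tokens to a target string.
    Matching is greedy, longest phrase first (an illustrative choice).
    """
    max_len = max((len(key) for key in translit_pairs), default=0)
    out = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in translit_pairs:
                # Escape quotes and angle brackets inside the attribute value.
                target = html.escape(translit_pairs[phrase], quote=True)
                source = " ".join(phrase)
                out.append(f'<{tag} translation="{target}">{source}</{tag}>')
                i += n
                matched = True
                break
        if not matched:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical transliteration pairs, for illustration only.
    pairs = {
        ("New", "York"): "နယူးယောက်",
        ("Tokyo",): "တိုကျို",
    }
    print(markup_sentence("He flew from Tokyo to New York".split(), pairs))
```

The marked-up line could then be piped to the decoder, for example `moses -f moses.ini -xml-input exclusive`, where `moses.ini` stands in for whatever configuration file the system uses. With `exclusive`, the decoder uses only the translation given in the markup for that span; with `inclusive`, the supplied translation competes with the phrase-table entries.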