English-Indonesian Neural Machine Translation for Spoken Language Domains

Meisyarah Dwiastuti


Abstract
In this work, we study Neural Machine Translation (NMT) for English-Indonesian (EN-ID) and Indonesian-English (ID-EN). We focus on spoken language domains, namely colloquial and speech languages. We build NMT systems using the Transformer model for both translation directions and apply domain adaptation, in which we continue training our pre-trained NMT systems on speech language (in-domain) data. Moreover, we evaluate how the domain-adaptation method in our EN-ID system can result in more formal translation outputs.
Anthology ID:
P19-2043
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Fernando Alva-Manchego, Eunsol Choi, Daniel Khashabi
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
309–314
URL:
https://aclanthology.org/P19-2043
DOI:
10.18653/v1/P19-2043
Cite (ACL):
Meisyarah Dwiastuti. 2019. English-Indonesian Neural Machine Translation for Spoken Language Domains. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 309–314, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
English-Indonesian Neural Machine Translation for Spoken Language Domains (Dwiastuti, ACL 2019)
PDF:
https://aclanthology.org/P19-2043.pdf