Luan-Nghia Pham

Also published as: Luan Nghia Pham


2022

KC4MT: A High-Quality Corpus for Multilingual Machine Translation
Vinh Van Nguyen | Ha Nguyen | Huong Thanh Le | Thai Phuong Nguyen | Tan Van Bui | Luan Nghia Pham | Anh Tuan Phan | Cong Hoang-Minh Nguyen | Viet Hong Tran | Anh Huu Tran
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Multilingual parallel corpora are an important resource for many applications of natural language processing (NLP). For machine translation, the size and quality of the training corpus largely determine the quality of the translation models. In this work, we present a method for building a high-quality multilingual parallel corpus in the news domain for several low-resource languages, including Vietnamese, Lao, and Khmer, to improve the quality of multilingual machine translation for these languages. We also publicly release the corpus, which includes 500,000 Vietnamese-Chinese bilingual sentence pairs, 150,000 Vietnamese-Lao bilingual sentence pairs, and 150,000 Vietnamese-Khmer bilingual sentence pairs.

2013

Vietnamese Text Accent Restoration with Statistical Machine Translation
Luan-Nghia Pham | Viet-Hong Tran | Vinh-Van Nguyen
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)