Product Review Translation using Phrase Replacement and Attention Guided Noise Augmentation

Kamal Gupta, Soumya Chennabasavaraj, Nikesh Garera, Asif Ekbal


Abstract
Product reviews provide valuable customer feedback; however, on most e-commerce platforms they are available today only in English. The nature of reviews provided by customers in any multilingual country poses unique challenges for machine translation, such as code-mixing, ungrammatical sentences, the presence of colloquial terms, and the lack of an e-commerce parallel corpus. Given that 44% of the Indian population speaks and operates in Hindi, we address the above challenges by presenting an English-to-Hindi neural machine translation (NMT) system to translate the product reviews available on e-commerce websites. We create an in-domain parallel corpus and handle various types of noise in reviews via two data augmentation techniques: (i) a novel phrase augmentation technique (PhrRep), in which the syntactic noun phrases in sentences are replaced by other noun phrases carrying different meanings but occurring in similar contexts; and (ii) a novel attention-guided noise augmentation (AttnNoise) technique to make our NMT model robust to various kinds of noise. Evaluation shows that the proposed augmentation techniques yield a 6.67 BLEU score improvement over the baseline model. To show that our proposed approach is not language-specific, we also perform experiments on two other language pairs, viz. En-Fr (MTNT18 corpus) and En-De (IWSLT17), which yield improvements of 2.55 and 0.91 BLEU points, respectively, over the baselines.
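The phrase replacement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how noun phrases are extracted, aligned, or selected, so the hand-written bilingual phrase list (`PHRASE_PAIRS`), the function name `phrase_replace`, and the plain substring matching below are all assumptions made for illustration.

```python
import random

# Hypothetical aligned noun-phrase pairs (English, Hindi); illustrative only.
# A real system would obtain these from a parser and a word/phrase aligner.
PHRASE_PAIRS = [
    ("the phone", "फोन"),
    ("the laptop", "लैपटॉप"),
    ("the charger", "चार्जर"),
]


def phrase_replace(src, tgt, pairs, rng):
    """Return new (src, tgt) pairs made by swapping one aligned noun phrase.

    For every known phrase pair found on both sides of the sentence pair,
    a different phrase pair is chosen at random and substituted on both
    sides, so the augmented pair stays parallel.
    """
    augmented = []
    for en, hi in pairs:
        if en in src and hi in tgt:
            # Candidates are all other pairs, so the meaning actually changes.
            others = [p for p in pairs if p != (en, hi)]
            new_en, new_hi = rng.choice(others)
            augmented.append(
                (src.replace(en, new_en, 1), tgt.replace(hi, new_hi, 1))
            )
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for pair in phrase_replace(
        "the phone is very good", "फोन बहुत अच्छा है", PHRASE_PAIRS, rng
    ):
        print(pair)
```

Replacing the phrase on both sides at once is the point of the sketch: each augmented pair remains a valid translation pair while exposing the model to a new noun phrase in a familiar context.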
Anthology ID:
2021.mtsummit-research.20
Volume:
Proceedings of Machine Translation Summit XVIII: Research Track
Month:
August
Year:
2021
Address:
Virtual
Editors:
Kevin Duh, Francisco Guzmán
Venue:
MTSummit
Publisher:
Association for Machine Translation in the Americas
Pages:
243–255
URL:
https://aclanthology.org/2021.mtsummit-research.20
Cite (ACL):
Kamal Gupta, Soumya Chennabasavaraj, Nikesh Garera, and Asif Ekbal. 2021. Product Review Translation using Phrase Replacement and Attention Guided Noise Augmentation. In Proceedings of Machine Translation Summit XVIII: Research Track, pages 243–255, Virtual. Association for Machine Translation in the Americas.
Cite (Informal):
Product Review Translation using Phrase Replacement and Attention Guided Noise Augmentation (Gupta et al., MTSummit 2021)
PDF:
https://aclanthology.org/2021.mtsummit-research.20.pdf
Data
MTNT