Enriching Source for English-to-Urdu Machine Translation

Bushra Jawaid, Amir Kamran, Ondřej Bojar


Abstract
This paper focuses on the generation of case markers for free word order languages that use case markers as phrasal clitics to mark the relationship between a dependent noun and its head. Generating such clitics becomes an essential task, especially when translating from fixed word order languages, where syntactic relations are identified by the positions of the dependent nouns. To address the problem of missing markers on the source side, artificial markers are added to the source to improve alignment with their target counterparts. An increase of up to 1 BLEU point over the baseline is observed on different test sets for English-to-Urdu.
Anthology ID:
W16-3706
Volume:
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
WSSANLP
Publisher:
The COLING 2016 Organizing Committee
Pages:
54–63
URL:
https://aclanthology.org/W16-3706
Cite (ACL):
Bushra Jawaid, Amir Kamran, and Ondřej Bojar. 2016. Enriching Source for English-to-Urdu Machine Translation. In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), pages 54–63, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Enriching Source for English-to-Urdu Machine Translation (Jawaid et al., WSSANLP 2016)
PDF:
https://aclanthology.org/W16-3706.pdf