Sami Ul Haq

2021

FJWU Participation for the WMT21 Biomedical Translation Task
Sumbal Naz | Sadaf Abdul Rauf | Sami Ul Haq
Proceedings of the Sixth Conference on Machine Translation

In this paper, we present FJWU’s system submitted to the biomedical shared task at WMT21. We prepared state-of-the-art multilingual neural machine translation systems for three source languages (German, Spanish and French) with English as the target language. Our NMT systems, based on the Transformer architecture, were trained on a combination of in-domain and out-of-domain parallel corpora developed using Information Retrieval (IR) and domain adaptation techniques.

2020

Document Level NMT of Low-Resource Languages with Backtranslation
Sami Ul Haq | Sadaf Abdul Rauf | Arsalan Shaukat | Abdullah Saeed
Proceedings of the Fifth Conference on Machine Translation

This paper describes our system submission to the WMT20 shared task on similar language translation. We examined the use of document-level neural machine translation (NMT) systems for the low-resource, similar language pair Marathi-Hindi. Our system extends the state-of-the-art Transformer architecture with hierarchical attention networks to incorporate contextual information. Since NMT requires a large amount of parallel data, which is not available for this task, our approach focuses on using monolingual data with back-translation to train our models. Our experiments reveal that document-level NMT can be a reasonable alternative to sentence-level NMT for improving the translation quality of low-resource languages, even when used with synthetic data.

FJWU participation for the WMT20 Biomedical Translation Task
Sumbal Naz | Sadaf Abdul Rauf | Noor-e-Hira | Sami Ul Haq
Proceedings of the Fifth Conference on Machine Translation

This paper describes the systems submitted by the FJWU-NRPU team to the WMT20 Biomedical shared translation task. We focused our submission on exploring the effects of adding in-domain corpora extracted from various out-of-domain sources. Systems were built for French to English using in-domain corpora through fine-tuning and selective data training. We further explored BERT-based models, with a specific focus on the effect of domain-adaptive subword units.

Improving Document-Level Neural Machine Translation with Domain Adaptation
Sami Ul Haq | Sadaf Abdul Rauf | Arslan Shoukat | Noor-e-Hira
Proceedings of the Fourth Workshop on Neural Generation and Translation

Recent studies have shown that the translation quality of NMT systems can be improved by providing document-level contextual information. In general, sentence-based NMT models are extended to capture contextual information from large-scale document-level corpora, which are difficult to acquire. Domain adaptation, on the other hand, promises to adapt components of already developed systems by exploiting limited in-domain data. This paper presents FJWU’s system submission at WNGT; we participated in the document-level MT task for German-English translation. Our system is based on a context-aware Transformer model developed on top of the original NMT architecture by integrating contextual information using attention networks. Our experimental results show that providing previous sentences as context significantly improves the BLEU score compared to a strong NMT baseline. We also studied the impact of domain adaptation on document-level translation and were able to improve results by adapting the systems to the testing domain.

Co-authors

Arslan Shoukat 1

Venues

wmt 3
ngt 1