Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Student Research Workshop
English-Manipuri language pair is one of the rarely investigated with restricted bilingual resources. The development of a factored Statistical Machine Translation (SMT) system between English as source and Manipuri, a morphologically rich language as target is reported. The role of the suffixes and dependency relations on the source side and case markers on the target side are identified as important translation factors. The morphology and dependency relations play important roles to improve the translation quality. A parallel corpus of 10350 sentences from news domain is used for training and the system is tested with 500 sentences. Using the proposed translation factors, the output of the translation quality is improved as indicated by the BLEU score and subjective evaluation.
We introduce a novel translation rule that captures discontinuous, partial constituent, and non-projective phrases from source language. Using the traversal order sequences of the dependency tree, our proposed method 1) extracts the synchronous rules in linear time and 2) combines them efficiently using the CYK chart parsing algorithm. We analytically show the effectiveness of this translation rule in translating relatively free order sentences, and empirically investigate the coverage of our proposed method.
An implementation of a non-structural Example-Based Machine Translation system that translates sentences from Arabic to English, using a parallel corpus aligned at the sentence level, is described. Source-language synonyms were derived automatically and used to help locate potential translation examples for fragments of a given input sentence. The smaller the parallel corpus, the greater the contribution provided by synonyms. Considering the degree of relevance of the subject matter of a potential match contributes to the quality of the final results.
Hebrew and Arabic are related but mutually incomprehensible languages with complex morphology and scarce parallel corpora. Machine translation between the two languages is therefore interesting and challenging. We discuss similarities and differences between Hebrew and Arabic, the benefits and challenges that they induce, respectively, and their implications for machine translation. We highlight the shortcomings of using English as a pivot language and advocate a direct, transfer-based and linguistically-informed (but still statistical, and hence scalable) approach. We report preliminary results of such a system that we are currently developing.