Marek Medveď


European Union Language Resources in Sketch Engine
Vít Baisa | Jan Michelfeit | Marek Medveď | Miloš Jakubíček
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Several parallel corpora built from European Union language resources are presented here. They were processed by state-of-the-art tools and made available to researchers in the corpus manager Sketch Engine. A completely new resource is introduced: the EUR-Lex Corpus, one of the largest parallel corpora available at the moment, containing 840 million English tokens. Its largest language pair, English-French, has more than 25 million aligned segments (paragraphs).

English-French Document Alignment Based on Keywords and Statistical Translation
Marek Medveď | Miloš Jakubíček | Vojtech Kovář
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers


Increasing Coverage of Translation Memories with Linguistically Motivated Segment Combination Methods
Vít Baisa | Aleš Horák | Marek Medveď
Proceedings of the Workshop Natural Language Processing for Translation Memories