Quality Analysis of Patent Parallel Corpus by the Scale
Isamu Okada | Shinichiro Miyazawa | Kazunari Ishida | Nobuhiko Shimizu | Toshizumi Ohta
Workshop on patent translation
Large-scale parallel corpora are extremely important for translation memory, example-based machine translation, and systems that support writing English sentences. Organized efforts to collect or build large-scale corpora are ongoing; however, such projects are difficult in terms of both copyright and economic efficiency. Investigating the general tendencies of large-scale corpora helps improve the economic efficiency of parallel corpus collection and of system construction. This study therefore clarifies the relationship between the scale of a parallel corpus and its degree of correspondence, using a parallel corpus of patents.