Semi-automatic compilation of bilingual lexicon entries from cross-lingually relevant news articles on WWW news sites

Takehito Utsuro, Takashi Horiuchi, Yasunobu Chiba, Takeshi Hamamoto


Abstract
To overcome the resource scarcity bottleneck in corpus-based translation knowledge acquisition research, this paper takes the approach of semi-automatically acquiring domain-specific translation knowledge from collections of bilingual news articles on WWW news sites. It presents the results of applying standard co-occurrence frequency based techniques for estimating bilingual term correspondences from parallel corpora to relevant article pairs automatically collected from WWW news sites. The experimental evaluation results are very encouraging and show that many useful bilingual term correspondences can be efficiently discovered from relevant article pairs on WWW news sites with little human intervention.
Anthology ID:
2002.amta-papers.17
Volume:
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
October 8-12
Year:
2002
Address:
Tiburon, USA
Editor:
Stephen D. Richardson
Venue:
AMTA
Publisher:
Springer
Pages:
165–176
URL:
https://link.springer.com/chapter/10.1007/3-540-45820-4_17
Cite (ACL):
Takehito Utsuro, Takashi Horiuchi, Yasunobu Chiba, and Takeshi Hamamoto. 2002. Semi-automatic compilation of bilingual lexicon entries from cross-lingually relevant news articles on WWW news sites. In Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 165–176, Tiburon, USA. Springer.
Cite (Informal):
Semi-automatic compilation of bilingual lexicon entries from cross-lingually relevant news articles on WWW news sites (Utsuro et al., AMTA 2002)
PDF:
https://link.springer.com/chapter/10.1007/3-540-45820-4_17