A Bayesian model for joint word alignment and part-of-speech transfer

Robert Östling


Abstract
Current methods for word alignment require considerable amounts of parallel text to deliver accurate results, a requirement which is met only for a small minority of the world’s approximately 7,000 languages. We show that by jointly performing word alignment and annotation transfer in a novel Bayesian model, alignment accuracy can be improved for language pairs where annotations are available for only one of the languages—a finding which could facilitate the study and processing of a vast number of low-resource languages. We also present an evaluation where our method is used to perform single-source and multi-source part-of-speech transfer with 22 translations of the same text in four different languages. This allows us to quantify the considerable variation in accuracy depending on the specific source text(s) used, even with different translations into the same language.
Anthology ID:
C16-1060
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
620–629
URL:
https://aclanthology.org/C16-1060
Cite (ACL):
Robert Östling. 2016. A Bayesian model for joint word alignment and part-of-speech transfer. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 620–629, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
A Bayesian model for joint word alignment and part-of-speech transfer (Östling, COLING 2016)
PDF:
https://aclanthology.org/C16-1060.pdf