Unsupervised Tree Induction for Tree-based Translation

Feifei Zhai, Jiajun Zhang, Yu Zhou, Chengqing Zong


Abstract
In current research, most tree-based translation models are built directly from parse trees. In this study, we go in another direction and build a translation model with an unsupervised tree structure derived from a novel non-parametric Bayesian model. In the model, we utilize synchronous tree substitution grammars (STSG) to capture the bilingual mapping between language pairs. To train the model efficiently, we develop a Gibbs sampler with three novel Gibbs operators. The sampler is capable of exploring the infinite space of tree structures by performing local changes on the tree nodes. Experimental results show that the string-to-tree translation system using our Bayesian tree structures significantly outperforms the strong baseline string-to-tree system using parse trees.
Anthology ID:
Q13-1020
Volume:
Transactions of the Association for Computational Linguistics, Volume 1
Year:
2013
Address:
Cambridge, MA
Editors:
Dekang Lin, Michael Collins
Venue:
TACL
Publisher:
MIT Press
Pages:
243–254
URL:
https://aclanthology.org/Q13-1020
DOI:
10.1162/tacl_a_00224
Cite (ACL):
Feifei Zhai, Jiajun Zhang, Yu Zhou, and Chengqing Zong. 2013. Unsupervised Tree Induction for Tree-based Translation. Transactions of the Association for Computational Linguistics, 1:243–254.
Cite (Informal):
Unsupervised Tree Induction for Tree-based Translation (Zhai et al., TACL 2013)
PDF:
https://aclanthology.org/Q13-1020.pdf