Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing

Weiwei Sun, Xiaojun Wan


Abstract
We present a comparative study of transition-, graph- and PCFG-based models aimed at illuminating more precisely the likely contribution of CFGs to improving Chinese dependency parsing accuracy, especially by combining heterogeneous models. Inspired by the impact of a constituency grammar on dependency parsing, we propose several strategies to acquire pseudo CFGs from dependency annotations alone. Compared to linguistic grammars learned from rich phrase-structure treebanks, well-designed pseudo grammars achieve similar parsing accuracy and contribute equivalently to parser ensembles. Moreover, pseudo grammars increase the diversity of base models; therefore, together with all other models, they further improve system combination. Based on automatic POS tagging, our final model achieves a UAS of 87.23%, a significant improvement over the state of the art.
Anthology ID:
Q13-1025
Volume:
Transactions of the Association for Computational Linguistics, Volume 1
Year:
2013
Address:
Cambridge, MA
Editors:
Dekang Lin, Michael Collins
Venue:
TACL
Publisher:
MIT Press
Pages:
301–314
URL:
https://aclanthology.org/Q13-1025
DOI:
10.1162/tacl_a_00229
Cite (ACL):
Weiwei Sun and Xiaojun Wan. 2013. Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing. Transactions of the Association for Computational Linguistics, 1:301–314.
Cite (Informal):
Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing (Sun & Wan, TACL 2013)
PDF:
https://aclanthology.org/Q13-1025.pdf