Tree-shape Uncertainty for Analyzing the Inherent Branching Bias of Unsupervised Parsing Models

Taiga Ishii, Yusuke Miyao


Abstract
This paper presents a formalization of tree-shape uncertainty that enables us to analyze the inherent branching bias of unsupervised parsing models using raw texts alone. Previous work analyzed the branching bias of unsupervised parsing models by comparing the outputs of trained parsers with gold syntactic trees. However, such approaches do not account for the fact that the same texts can be generated by different grammars with different syntactic trees, and may therefore fail to clearly separate the inherent bias of the model from the bias in the training data that the model learns. To this end, we formulate tree-shape uncertainty and derive sufficient conditions that can be used for creating texts that are expected to contain no biased information on branching. In experiments, we show that training parsers on such unbiased texts can effectively detect the branching bias of existing unsupervised parsing models. Such bias may depend only on the algorithm, or it may depend on seemingly unrelated dataset statistics such as sequence length and vocabulary size.
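To make the notion of branching bias concrete, the following is a minimal sketch of one simple way to quantify the branching tendency of binary parse trees. This is an illustrative metric only, not the paper's formal definition of tree-shape uncertainty: it counts, for each internal node, whether the deeper child is the right or the left subtree. All names here (`branching_counts`, `depth`) are hypothetical.

```python
# Hypothetical sketch: quantify the branching tendency of binary parse trees.
# A tree is a nested 2-tuple of subtrees; leaves are strings.

def depth(tree):
    """Depth of a tree; a leaf has depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(child) for child in tree)

def branching_counts(tree):
    """Return (right, left): counts of internal nodes whose deeper
    child is the right (resp. left) subtree. Ties count as neither."""
    if isinstance(tree, str):
        return 0, 0
    l, r = tree
    right_l, left_l = branching_counts(l)
    right_r, left_r = branching_counts(r)
    dl, dr = depth(l), depth(r)
    right = right_l + right_r + (1 if dr > dl else 0)
    left = left_l + left_r + (1 if dl > dr else 0)
    return right, left

# A fully right-branching tree scores (2, 0); a fully left-branching one (0, 2).
print(branching_counts(("a", ("b", ("c", "d")))))   # right-branching
print(branching_counts(((("a", "b"), "c"), "d")))   # left-branching
```

A parser that consistently produces trees with a high right (or left) count on texts constructed to carry no branching signal would, under this toy metric, be exhibiting an inherent bias of the kind the paper analyzes.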
Anthology ID:
2023.conll-1.36
Volume:
Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)
Month:
December
Year:
2023
Address:
Singapore
Editors:
Jing Jiang, David Reitter, Shumin Deng
Venue:
CoNLL
Publisher:
Association for Computational Linguistics
Pages:
532–547
URL:
https://aclanthology.org/2023.conll-1.36
DOI:
10.18653/v1/2023.conll-1.36
Cite (ACL):
Taiga Ishii and Yusuke Miyao. 2023. Tree-shape Uncertainty for Analyzing the Inherent Branching Bias of Unsupervised Parsing Models. In Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL), pages 532–547, Singapore. Association for Computational Linguistics.
Cite (Informal):
Tree-shape Uncertainty for Analyzing the Inherent Branching Bias of Unsupervised Parsing Models (Ishii & Miyao, CoNLL 2023)
PDF:
https://aclanthology.org/2023.conll-1.36.pdf