Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models

Taeuk Kim


Abstract
Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models. Like in-context learning, it is attractive because it requires no task-specific fine-tuning, but its practical effectiveness remains unclear beyond its use as a probe for investigating language models' inner workings. In this work, we mathematically reformulate CPE-PLM and propose two advanced ensemble methods tailored for it, demonstrating that, when a set of heterogeneous PLMs is combined using our techniques, the new parsing paradigm can be competitive with common unsupervised parsers. Furthermore, we explore some scenarios where the trees generated by CPE-PLM are practically useful. Specifically, we show that CPE-PLM is more effective than typical supervised parsers in few-shot settings.
Anthology ID:
2022.coling-1.479
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5398–5408
URL:
https://aclanthology.org/2022.coling-1.479
Cite (ACL):
Taeuk Kim. 2022. Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5398–5408, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models (Kim, COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.479.pdf
Data
Penn Treebank, SST