On Parsing as Tagging

Afra Amini, Ryan Cotterell


Abstract
There are many proposals to reduce constituency parsing to tagging. To understand what these approaches have in common, we offer a unifying pipeline consisting of three steps: linearization, learning, and decoding. We prove that classic shift–reduce parsing can be reduced to tetratagging, the state-of-the-art constituency tagger, under two assumptions: a right-corner transformation in the linearization step and factored scoring in the learning step. We then ask which factor is most critical in making parsing-as-tagging methods both accurate and efficient. To answer this question, we empirically evaluate a taxonomy of tagging pipelines with different choices of linearizers, learners, and decoders. Based on results on English as well as a set of 8 typologically diverse languages, we conclude that the linearization of the derivation tree and its alignment with the input sequence is the most critical factor in achieving accurate parsers as taggers.
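
To make the abstract's three-step pipeline concrete, here is a minimal sketch of the linearization step in the spirit of the tetratagging scheme referenced above: an in-order traversal turns a binarized tree with n leaves into 2n − 1 tags that align with the input words. The nested-tuple tree encoding, the tag names, and the convention of treating the root as a left child are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of the linearization step of a parsing-as-tagging pipeline,
# in the spirit of the tetratagging scheme referenced in the abstract.
# The nested-tuple tree encoding, tag names, and root-direction convention
# are illustrative assumptions, not the paper's exact formulation.

def tetratag(tree, is_left_child=True):
    """Linearize a binarized tree into a tag sequence via in-order traversal.

    `tree` is either a string (a word at a leaf) or a pair (left, right)
    of subtrees. Each node emits one tag:
      'l' / 'r' for a leaf that is a left / right child of its parent,
      'L' / 'R' for an internal node that is a left / right child.
    A tree with n leaves therefore yields 2n - 1 tags: one per word and one
    per fencepost between adjacent words, which is what keeps the tag
    sequence aligned with the input sentence.
    """
    if isinstance(tree, str):  # leaf: the tag records its direction
        return ["l" if is_left_child else "r"]
    left, right = tree
    return (
        tetratag(left, is_left_child=True)
        + ["L" if is_left_child else "R"]  # internal node, visited in-order
        + tetratag(right, is_left_child=False)
    )

# Unlabeled, binarized tree for "she reads long books"
tree = (("she", "reads"), ("long", "books"))
print(tetratag(tree))  # ['l', 'L', 'r', 'L', 'l', 'R', 'r']
```

In this framing, the learning step would score candidate tags for each position (e.g., with a neural tagger), and the decoding step would recover the highest-scoring tag sequence that corresponds to a well-formed tree.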
Anthology ID: 2022.emnlp-main.607
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8884–8900
URL: https://aclanthology.org/2022.emnlp-main.607
DOI: 10.18653/v1/2022.emnlp-main.607
Cite (ACL): Afra Amini and Ryan Cotterell. 2022. On Parsing as Tagging. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8884–8900, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): On Parsing as Tagging (Amini & Cotterell, EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.607.pdf