Learned Incremental Representations for Parsing

Nikita Kitaev, Thomas Lu, Dan Klein


Abstract
We present an incremental syntactic representation that consists of assigning a single discrete label to each word in a sentence, where the label is predicted using strictly incremental processing of a prefix of the sentence, and the sequence of labels for a sentence fully determines a parse tree. Our goal is to induce a syntactic representation that commits to syntactic choices only as they are incrementally revealed by the input, in contrast with standard representations that must make output choices such as attachments speculatively and later throw out conflicting analyses. Our learned representations achieve 93.72 F1 on the Penn Treebank with as few as 5 bits per word, and at 8 bits per word they achieve 94.97 F1, which is comparable with other state-of-the-art parsing models when using the same pre-trained embeddings. We also provide an analysis of the representations learned by our system, investigating properties such as the interpretable syntactic features captured by the system and mechanisms for deferred resolution of syntactic ambiguities.
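To make the setup described in the abstract concrete, the short Python sketch below illustrates the general interface only: each word receives one k-bit discrete label (so at most 2^k distinct labels), each label is chosen from the sentence prefix alone, and a separate deterministic step maps the finished label sequence to a tree. The predict_label and decode functions here are hypothetical placeholders for illustration; they are not the paper's actual model, label inventory, or decoding scheme.

# Illustrative sketch only: the label inventory, the toy prefix-based
# predictor, and the decoder are hypothetical stand-ins, not the paper's
# method. They show the interface the abstract describes: one k-bit
# discrete label per word, chosen from the prefix alone, with a
# deterministic mapping from the label sequence to a tree.

import zlib
from typing import List

BITS_PER_WORD = 5              # 5 bits -> at most 32 distinct labels per word
NUM_LABELS = 2 ** BITS_PER_WORD

def predict_label(prefix: List[str]) -> int:
    """Toy stand-in for an incremental model: the label for the newest word
    may depend only on the words seen so far (the prefix), never on the
    suffix. Here we just hash the prefix deterministically into the label
    space."""
    return zlib.crc32(" ".join(prefix).encode("utf-8")) % NUM_LABELS

def encode_incrementally(words: List[str]) -> List[int]:
    """Assign one discrete label per word, strictly left to right."""
    return [predict_label(words[: i + 1]) for i in range(len(words))]

def decode(words: List[str], labels: List[int]) -> str:
    """Hypothetical deterministic decoder: in the paper, the label sequence
    fully determines a parse tree; here we only emit a flat bracketing to
    show that decoding is a pure function of (words, labels)."""
    leaves = [f"({lab} {w})" for w, lab in zip(words, labels)]
    return "(S " + " ".join(leaves) + ")"

if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    labels = encode_incrementally(sentence)
    print(labels)
    print(decode(sentence, labels))

The design point the sketch isolates is that no label may look ahead: any syntactic ambiguity that the prefix cannot resolve must be deferred to the decoding step rather than committed to speculatively.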
Anthology ID: 2022.acl-long.220
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 3086–3095
URL: https://aclanthology.org/2022.acl-long.220
DOI: 10.18653/v1/2022.acl-long.220
Award: Best Paper
Cite (ACL): Nikita Kitaev, Thomas Lu, and Dan Klein. 2022. Learned Incremental Representations for Parsing. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3086–3095, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): Learned Incremental Representations for Parsing (Kitaev et al., ACL 2022)
PDF: https://aclanthology.org/2022.acl-long.220.pdf
Code: thomaslu2000/incremental-parsing-representations
Data: Penn Treebank