Efficient Constituency Parsing by Pointing

Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li


Abstract
We propose a novel constituency parsing model that casts the parsing problem into a series of pointing tasks. Specifically, our model estimates the likelihood of a span being a legitimate tree constituent via the pointing score corresponding to the boundary words of the span. Our parsing model supports efficient top-down decoding, and our learning objective enforces structural consistency without resorting to expensive CKY inference. Experiments on the standard English Penn Treebank parsing task show that our method achieves 92.78 F1 without using pre-trained models, which is higher than all existing methods with similar time complexity. Using pre-trained BERT, our model achieves 95.48 F1, which is competitive with the state of the art while being faster. Our approach also establishes a new state of the art in Basque and Swedish in the SPMRL shared tasks on multilingual constituency parsing.
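To make the idea of top-down decoding from pointing scores concrete, here is a minimal Python sketch. It is an illustrative simplification, not the algorithm from the paper: the matrix `point`, the function `decode_top_down`, and the split rule (add the boundary-word scores of the two candidate children) are all assumptions made for this example. It only shows how a tree can be built greedily from boundary-word scores without filling a CKY chart.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive word indices (start, end)


def decode_top_down(point: List[List[float]]) -> List[Span]:
    """Greedily split spans top-down using a pointing-score matrix.

    point[a][b] is a hypothetical score for word a pointing at word b,
    read here as "a and b are the boundary words of a constituent".
    This scoring rule is an assumption for illustration only.
    """
    n = len(point)
    if n == 0:
        return []
    spans: List[Span] = []
    stack: List[Span] = [(0, n - 1)]  # start from the whole sentence
    while stack:
        i, j = stack.pop()
        spans.append((i, j))
        if i == j:
            continue  # single word: nothing left to split
        # Split after word k: left child (i, k), right child (k + 1, j).
        best_k = max(range(i, j), key=lambda k: point[i][k] + point[k + 1][j])
        # Push right first so the left child is expanded first (pre-order).
        stack.append((best_k + 1, j))
        stack.append((i, best_k))
    return spans
```

For a three-word input with `point = [[0, 5, 0], [0, 0, 1], [0, 0, 0]]`, the sketch returns `[(0, 2), (0, 1), (0, 0), (1, 1), (2, 2)]`: the high score between words 0 and 1 groups them first. The loop makes at most n - 1 splits and scans at most n candidate split points per split, whereas a CKY chart takes cubic time in sentence length.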
Anthology ID:
2020.acl-main.301
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3284–3294
URL:
https://aclanthology.org/2020.acl-main.301
DOI:
10.18653/v1/2020.acl-main.301
Cite (ACL):
Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, and Xiaoli Li. 2020. Efficient Constituency Parsing by Pointing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3284–3294, Online. Association for Computational Linguistics.
Cite (Informal):
Efficient Constituency Parsing by Pointing (Nguyen et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.301.pdf
Video:
http://slideslive.com/38928813
Data:
Penn Treebank