Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle

Maximin Coavoux, Shay B. Cohen


Abstract
We introduce a novel transition system for discontinuous constituency parsing. Instead of storing subtrees in a stack –i.e. a data structure with linear-time sequential access– the proposed system uses a set of parsing items, with constant-time random access. This change makes it possible to construct any discontinuous constituency tree in exactly 4n–2 transitions for a sentence of length n. At each parsing step, the parser considers every item in the set to be combined with a focus item and to construct a new constituent in a bottom-up fashion. The parsing strategy is based on the assumption that most syntactic structures can be parsed incrementally and that the set –the memory of the parser– remains reasonably small on average. Moreover, we introduce a provably correct dynamic oracle for the new transition system, and present the first experiments in discontinuous constituency parsing using a dynamic oracle. Our parser obtains state-of-the-art results on three English and German discontinuous treebanks.
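The 4n–2 figure can be made concrete with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' implementation: the abstract gives only the set of items, the focus item, and the bottom-up combination step. The action names (`shift`, `combine`, `label`), the `Config` class, and the convention that every structural action is followed by exactly one labelling decision are illustrative assumptions. Under those assumptions a sentence of length n needs n shifts and n–1 combines, each followed by one labelling step, which gives 2(2n–1) = 4n–2 transitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Item = frozenset  # an item is the set of token indices it covers


@dataclass
class Config:
    """Hypothetical parser configuration: an unordered memory plus a focus item."""
    items: Set[Item] = field(default_factory=set)   # memory: a set, not a stack
    focus: Optional[Item] = None                     # item currently being built
    next_tok: int = 0                                # index of the next unread token
    constituents: List[Tuple[str, Item]] = field(default_factory=list)
    steps: int = 0                                   # number of transitions taken


def shift(c: Config) -> None:
    """Store the current focus in the set; the next token becomes the focus."""
    if c.focus is not None:
        c.items.add(c.focus)
    c.focus = frozenset({c.next_tok})
    c.next_tok += 1
    c.steps += 1


def combine(c: Config, item: Item) -> None:
    """Merge any item from the set into the focus (set-based, no stack order)."""
    c.items.remove(item)
    c.focus = c.focus | item
    c.steps += 1


def label(c: Config, name: Optional[str] = None) -> None:
    """Labelling decision: record a constituent over the focus, or no label."""
    if name is not None:
        c.constituents.append((name, c.focus))
    c.steps += 1


def demo() -> Config:
    """Derive a 4-token tree whose VP covers {0, 2, 3}; token 1 interrupts it."""
    c = Config()
    shift(c); label(c)                                  # focus {0}
    shift(c); label(c)                                  # focus {1}, set {{0}}
    shift(c); label(c)                                  # focus {2}, set {{0}, {1}}
    combine(c, frozenset({0})); label(c)                # focus {0, 2}, skipping {1}
    shift(c); label(c)                                  # focus {3}
    combine(c, frozenset({0, 2})); label(c, "VP")       # discontinuous VP {0, 2, 3}
    combine(c, frozenset({1})); label(c, "S")           # root covers all tokens
    return c


if __name__ == "__main__":
    final = demo()
    print(final.steps, final.constituents)
```

In this sketch the fourth step combines the focus `{2}` directly with `{0}` although `{1}` was stored more recently, which is the kind of access a stack would not allow without extra reordering actions.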
Anthology ID:
N19-1018
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
204–217
URL:
https://aclanthology.org/N19-1018
DOI:
10.18653/v1/N19-1018
Cite (ACL):
Maximin Coavoux and Shay B. Cohen. 2019. Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 204–217, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle (Coavoux & Cohen, NAACL 2019)
PDF:
https://aclanthology.org/N19-1018.pdf
Code:
mcoavoux/discoparset
Data:
Penn Treebank