Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation
King Chan | Julian Brooke | Timothy Baldwin
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
This paper presents a methodology for identifying and resolving various kinds of inconsistency in the context of merging dependency and multiword expression (MWE) annotations, to generate a dependency treebank with comprehensive MWE annotations. Candidates for correction are identified using a variety of heuristics, including an entirely novel one which identifies violations of MWE constituency in the dependency tree, and resolved by arbitration with minimal human intervention. Using this technique, we identified and corrected several hundred errors across both parse and MWE annotations, representing changes to a significant percentage (well over 10%) of the MWE instances in the joint corpus.