Mumpitz at PARSEME Shared Task 2018: A Bidirectional LSTM for the Identification of Verbal Multiword Expressions

Rafael Ehren, Timm Lichte, Younes Samih


Abstract
In this paper, we describe Mumpitz, the system we submitted to the PARSEME Shared Task on automatic identification of verbal multiword expressions (VMWEs). Mumpitz consists of a Bidirectional Recurrent Neural Network (BRNN) with Long Short-Term Memory (LSTM) units and a heuristic that leverages the dependency information provided in the PARSEME corpus data to differentiate VMWEs in a sentence. We submitted results for seven languages in the closed track of the task and for one language in the open track. For the open track, we used the same system, but with pretrained instead of randomly initialized word embeddings to improve system performance.
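The abstract does not spell out the dependency heuristic, so the following is only a minimal sketch of how such a grouping step could look, not the authors' implementation. It assumes the tagger has already marked which tokens belong to some VMWE, and it assumes (hypothetically) that each tagged verb opens its own VMWE while every other tagged token joins the nearest tagged verb above it in the dependency tree. The function name `group_vmwes` and the token layout are illustrative.

```python
def group_vmwes(tokens, tagged):
    """Split tagged tokens into separate VMWE candidates.

    tokens: dict mapping token id -> (form, upos, head id), with head 0 = root.
    tagged: set of token ids that the tagger marked as part of some VMWE.

    Assumption (not taken from the paper): every tagged verb starts its own
    VMWE, and each tagged non-verb attaches to the closest tagged verb found
    by climbing its chain of dependency heads.
    """
    # One group per tagged verb, keyed by the verb's token id.
    groups = {tid: [tid] for tid in sorted(tagged) if tokens[tid][1] == "VERB"}

    for tid in sorted(tagged):
        if tid in groups:
            continue
        # Climb towards the root until a tagged verb is reached.
        current = tokens[tid][2]
        visited = set()
        while current != 0 and current not in groups and current not in visited:
            visited.add(current)
            current = tokens[current][2]
        if current in groups:
            groups[current].append(tid)
        # Tagged tokens without a tagged verb ancestor are dropped here.

    return [sorted(members) for _, members in sorted(groups.items())]


# Toy sentence: "He took a decision and gave up"
example_tokens = {
    1: ("He", "PRON", 2),
    2: ("took", "VERB", 0),
    3: ("a", "DET", 4),
    4: ("decision", "NOUN", 2),
    5: ("and", "CCONJ", 6),
    6: ("gave", "VERB", 2),
    7: ("up", "ADP", 6),
}

print(group_vmwes(example_tokens, {2, 4, 6, 7}))  # [[2, 4], [6, 7]]
```

In this toy case the two expressions "took ... decision" and "gave up" come out as separate groups even though a flat per-token tagging would not distinguish them. A real system would also have to handle cases this sketch ignores, such as VMWEs without a verbal head or overlapping expressions.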
Anthology ID:
W18-4929
Volume:
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
LAW
SIGs:
SIGLEX | SIGANN
Publisher:
Association for Computational Linguistics
Pages:
261–267
URL:
https://aclanthology.org/W18-4929
Cite (ACL):
Rafael Ehren, Timm Lichte, and Younes Samih. 2018. Mumpitz at PARSEME Shared Task 2018: A Bidirectional LSTM for the Identification of Verbal Multiword Expressions. In Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pages 261–267, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Mumpitz at PARSEME Shared Task 2018: A Bidirectional LSTM for the Identification of Verbal Multiword Expressions (Ehren et al., LAW 2018)
PDF:
https://aclanthology.org/W18-4929.pdf