Cooperating Tools for MWE Lexicon Management and Corpus Annotation

Yuji Matsumoto, Akihiko Kato, Hiroyuki Shindo, Toshio Morita


Abstract
We present tools for lexicon and corpus management that cooperate in corpus annotation. The lexicon tool, named Cradle, stores a set of words and expressions, in which multi-word expressions (MWEs) are defined with their own part-of-speech (POS) information and internal syntactic structures. The corpus tool, named ChaKi, manages text corpora annotated with POS tags and syntactic dependency structures. The two tools cooperate in both directions: ChaKi refers directly to the words and multi-word expressions stored in Cradle when conducting corpus annotation, and the words and expressions annotated in ChaKi can be output as a list of lexical entities to be stored in Cradle.
Anthology ID:
W18-4922
Volume:
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Agata Savary, Carlos Ramisch, Jena D. Hwang, Nathan Schneider, Melanie Andresen, Sameer Pradhan, Miriam R. L. Petruck
Venues:
LAW | MWE
SIGs:
SIGANN | SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
201–206
URL:
https://aclanthology.org/W18-4922
Cite (ACL):
Yuji Matsumoto, Akihiko Kato, Hiroyuki Shindo, and Toshio Morita. 2018. Cooperating Tools for MWE Lexicon Management and Corpus Annotation. In Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pages 201–206, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Cooperating Tools for MWE Lexicon Management and Corpus Annotation (Matsumoto et al., LAW-MWE 2018)
PDF:
https://aclanthology.org/W18-4922.pdf
Data:
Universal Dependencies