Constructing an Annotated Corpus of Verbal MWEs for English

Abigail Walsh, Claire Bonial, Kristina Geeraert, John P. McCrae, Nathan Schneider, Clarissa Somers


Abstract
This paper describes the construction and annotation of a corpus of verbal MWEs for English, as part of the PARSEME Shared Task 1.1 on automatic identification of verbal MWEs. The criteria for corpus selection, the categories of MWEs used, and the training process are discussed, along with the particular issues that led to revisions in edition 1.1 of the annotation guidelines. Finally, an overview of the characteristics of the final annotated corpus is presented, as well as some discussion on inter-annotator agreement.
Anthology ID:
W18-4921
Volume:
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Agata Savary, Carlos Ramisch, Jena D. Hwang, Nathan Schneider, Melanie Andresen, Sameer Pradhan, Miriam R. L. Petruck
Venues:
LAW | MWE
SIGs:
SIGLEX | SIGANN
Publisher:
Association for Computational Linguistics
Note:
Pages:
193–200
Language:
URL:
https://aclanthology.org/W18-4921
DOI:
Bibkey:
Cite (ACL):
Abigail Walsh, Claire Bonial, Kristina Geeraert, John P. McCrae, Nathan Schneider, and Clarissa Somers. 2018. Constructing an Annotated Corpus of Verbal MWEs for English. In Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pages 193–200, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Constructing an Annotated Corpus of Verbal MWEs for English (Walsh et al., LAW-MWE 2018)
Copy Citation:
PDF:
https://aclanthology.org/W18-4921.pdf
Data
English Web TreebankSTREUSLE