Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations.

Marcos Garcia, Marcos García Salido, Susana Sotelo, Estela Mosqueira, Margarita Alonso-Ramos


Abstract
This paper presents a new multilingual corpus with semantic annotation of collocations in English, Portuguese, and Spanish. The whole resource contains 155k tokens and 1,526 collocations labeled in context. The annotated examples belong to three syntactic relations (adjective-noun, verb-object, and nominal compounds) and represent 58 lexical functions of the Meaning-Text Theory (e.g., Oper, Magn, Bon). Each collocation was annotated by three linguists, and the final resource was revised by a team of experts. The resulting corpus can serve as a basis for evaluating different approaches to collocation identification, which in turn can be useful for NLP tasks such as natural language understanding or natural language generation.
Anthology ID:
P19-1392
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4012–4019
URL:
https://aclanthology.org/P19-1392
DOI:
10.18653/v1/P19-1392
Cite (ACL):
Marcos Garcia, Marcos García Salido, Susana Sotelo, Estela Mosqueira, and Margarita Alonso-Ramos. 2019. Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4012–4019, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations. (Garcia et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1392.pdf