The Open Cantonese Sense-Tagged Corpus

Joanna Sio, Luis Morgado Da Costa


Abstract
This paper introduces the Open Cantonese Sense-Tagged Corpus, a new and ongoing project that serves as a companion to the development of the Cantonese Wordnet. The corpus is built on top of the Cantonese Wordnet Corpus, which currently provides example sentences for most verbs in this wordnet. The paper motivates the choice of starting a sense-tagged corpus from both linguistic and educational perspectives, and discusses the current solutions to issues arising from the sense-tagging exercise. In total, we have tagged over 5,000 concepts, with more than 3,700 direct links to the Cantonese Wordnet.
Anthology ID:
2023.gwc-1.32
Volume:
Proceedings of the 12th Global Wordnet Conference
Month:
January
Year:
2023
Address:
University of the Basque Country, Donostia - San Sebastian, Basque Country
Editors:
German Rigau, Francis Bond, Alexandre Rademaker
Venue:
GWC
Publisher:
Global Wordnet Association
Pages:
263–268
URL:
https://aclanthology.org/2023.gwc-1.32
Cite (ACL):
Joanna Sio and Luis Morgado Da Costa. 2023. The Open Cantonese Sense-Tagged Corpus. In Proceedings of the 12th Global Wordnet Conference, pages 263–268, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
Cite (Informal):
The Open Cantonese Sense-Tagged Corpus (Sio & Costa, GWC 2023)
PDF:
https://aclanthology.org/2023.gwc-1.32.pdf