Towards the Automatic Identification of Light Verb Constructions in Bulgarian

Ivelina Stoyanova; Svetlozara Leseva; Maria Todorova

Abstract

This paper presents work in progress on a method for the automatic identification of light verb constructions (LVCs) as a subclass of Bulgarian verbal multiword expressions (MWEs). The method is based on machine learning and is trained on a set of LVCs extracted from the Bulgarian WordNet (BulNet) and the Bulgarian National Corpus (BulNC). The classifiers use lexical, morphosyntactic, syntactic and semantic features of LVCs. We trained and tested two separate classifiers using the Java package Weka and two decision tree learning algorithms, J48 and RandomTree. The evaluation includes 10-fold cross-validation on the training data from BulNet (F1 = 0.766 for J48 and F1 = 0.725 for RandomTree), as well as evaluation on new instances from the BulNC (F1 = 0.802 for J48 and F1 = 0.607 for RandomTree). Preliminary filtering of the candidates gives a slight improvement (F1 = 0.802 for J48 and F1 = 0.737 for RandomTree).
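The evaluation protocol named in the abstract (10-fold cross-validation scored by F1) can be illustrated with a minimal sketch. This is not the authors' Weka pipeline: the function names `f1_score` and `k_fold_indices` are hypothetical, the labels are assumed to be binary (1 = LVC, 0 = not an LVC), and F1 is computed for the positive class only, which may differ from the averaging used in the paper.

```python
from typing import List, Sequence, Tuple


def f1_score(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """F1 for the positive class (assumed encoding: 1 = LVC, 0 = not an LVC)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    if tp == 0:
        # Precision and recall are both 0 (or undefined); report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_items-1 into k (train, test) pairs.

    Items are assigned to folds round-robin; a real experiment would
    usually shuffle (and possibly stratify) first.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Toy labels only; these are not data from the paper.
    gold = [1, 1, 0, 0]
    predicted = [1, 0, 1, 0]
    print(f1_score(gold, predicted))
    print(len(k_fold_indices(25, k=10)))
```

In a full experiment, each (train, test) pair would be used to fit a classifier on the training indices and score it on the held-out fold, with the per-fold results then aggregated.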

Anthology ID: 2016.clib-1.4
Volume: Proceedings of the Second International Conference on Computational Linguistics in Bulgaria (CLIB 2016)
Month: September
Year: 2016
Address: Sofia, Bulgaria
Venue: CLIB
Publisher: Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences
Pages: 28–37
URL: https://aclanthology.org/2016.clib-1.4/
Cite (ACL): Ivelina Stoyanova, Svetlozara Leseva, and Maria Todorova. 2016. Towards the Automatic Identification of Light Verb Constructions in Bulgarian. In Proceedings of the Second International Conference on Computational Linguistics in Bulgaria (CLIB 2016), pages 28–37, Sofia, Bulgaria. Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences.
Cite (Informal): Towards the Automatic Identification of Light Verb Constructions in Bulgarian (Stoyanova et al., CLIB 2016)
PDF: https://aclanthology.org/2016.clib-1.4.pdf
