Pushpak Bhattacharyya


2016

Detection of Compound Nouns and Light Verb Constructions using IndoWordNet
Dhirendra Singh | Sudha Bhingardive | Pushpak Bhattacharyya
Proceedings of the 8th Global WordNet Conference (GWC)

Detection of Multiword Expressions (MWEs) is one of the fundamental problems in Natural Language Processing. In this paper, we focus on two categories of MWEs: Compound Nouns and Light Verb Constructions. These two categories can be tackled using knowledge bases rather than pure statistics. We investigate the usability of IndoWordNet for the detection of MWEs. Our IndoWordNet-based approach uses semantic and ontological features of words that can be extracted from IndoWordNet. The approach has been tested on seven Indian languages: Assamese, Bengali, Hindi, Konkani, Marathi, Odia and Punjabi. Results show that ontological features are very useful for the detection of light verb constructions, while semantic properties give satisfactory results for the detection of compound nouns. The approach can be easily adapted to other Indian languages. Detected MWEs can be interpolated into WordNets, as they help in representing semantic knowledge.