Classification of Multiword Expressions in Malayalam

Treesa Cyriac, Sobha Lalitha Devi


Abstract
Multiword expressions (MWEs) are an interesting phenomenon in language, and the MWEs of a language are not easy for a non-native speaker to understand. They include lexicalized phrases, idioms, collocations, etc. Data on multiword expressions are helpful in language processing. Multiword expressions in Malayalam are a less studied area. In this paper, we explore multiword expressions in Malayalam and classify them according to three idiosyncrasies: semantic idiosyncrasy, syntactic idiosyncrasy, and statistical idiosyncrasy. Though these idiosyncrasies have already been identified, they have not been studied for Malayalam. The classification and its features are presented and examined using Malayalam multiword expressions. Through this study, we identify how linguistic features of Malayalam, such as agglutination, influence its multiword expressions in terms of pronunciation and spelling. Malayalam also has a set of code-mixed multiword expressions, which this study addresses as well.
Anthology ID:
2022.wildre-1.10
Volume:
Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Girish Nath Jha, Sobha L., Kalika Bali, Atul Kr. Ojha
Venue:
WILDRE
Publisher:
European Language Resources Association
Pages:
55–59
URL:
https://aclanthology.org/2022.wildre-1.10
Cite (ACL):
Treesa Cyriac and Sobha Lalitha Devi. 2022. Classification of Multiword Expressions in Malayalam. In Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference, pages 55–59, Marseille, France. European Language Resources Association.
Cite (Informal):
Classification of Multiword Expressions in Malayalam (Cyriac & Lalitha Devi, WILDRE 2022)
PDF:
https://aclanthology.org/2022.wildre-1.10.pdf