Word Level Language Identification in Code-mixed Kannada-English Texts using traditional machine learning algorithms

M. Shahiki Tash, Z. Ahani, A.l. Tonja, M. Gemeda, N. Hussain, O. Kolesnikova


Abstract
This paper describes our system for the CoLI-Kanglish 2022 shared task on word-level language identification in code-mixed Kannada-English texts. The goal of the task is to identify the language of each word in the CoLI-Kanglish 2022 dataset, which is divided into categories including Kannada, English, Mixed-Language, Location, Name, and Others. The code-mixed dataset was compiled by the CoLI-Kanglish 2022 organizers from posts on social media. We use two classification techniques, KNN and SVM, achieving an F1-score of 0.58 and placing third out of nine competitors.
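As a rough illustration of word-level classification with KNN, the sketch below labels a word by majority vote among its nearest training words. It is not the authors' system: the abstract does not state which features were used, so the character n-gram features, the cosine similarity, the value of k, and the tiny TRAIN list are all assumptions made up for this example.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy training data (NOT from the CoLI-Kanglish dataset).
TRAIN = [
    ("nanu", "Kannada"), ("illa", "Kannada"), ("beku", "Kannada"),
    ("hogi", "Kannada"), ("maadi", "Kannada"),
    ("the", "English"), ("good", "English"), ("movie", "English"),
    ("song", "English"), ("nice", "English"),
]


def char_ngrams(word, n_min=1, n_max=3):
    """Count character n-grams of a lowercased word padded with < and >."""
    padded = f"<{word.lower()}>"
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(word, train, k=3):
    """Return the majority label among the k most similar training words."""
    query = char_ngrams(word)
    ranked = sorted(
        train,
        key=lambda item: cosine(query, char_ngrams(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Calling knn_predict("maadu", TRAIN) returns one of the label strings in TRAIN. An SVM variant would feed the same kind of n-gram features to a linear classifier instead of voting among neighbours.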
Anthology ID:
2022.icon-wlli.5
Volume:
Proceedings of the 19th International Conference on Natural Language Processing (ICON): Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts
Month:
December
Year:
2022
Address:
IIIT Delhi, New Delhi, India
Editors:
Bharathi Raja Chakravarthi, Abirami Murugappan, Dhivya Chinnappa, Adeep Hande, Prasanna Kumar Kumaresan, Rahul Ponnusamy
Venue:
ICON
Publisher:
Association for Computational Linguistics
Pages:
25–28
URL:
https://aclanthology.org/2022.icon-wlli.5
Cite (ACL):
M. Shahiki Tash, Z. Ahani, A.l. Tonja, M. Gemeda, N. Hussain, and O. Kolesnikova. 2022. Word Level Language Identification in Code-mixed Kannada-English Texts using traditional machine learning algorithms. In Proceedings of the 19th International Conference on Natural Language Processing (ICON): Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts, pages 25–28, IIIT Delhi, New Delhi, India. Association for Computational Linguistics.
Cite (Informal):
Word Level Language Identification in Code-mixed Kannada-English Texts using traditional machine learning algorithms (Shahiki Tash et al., ICON 2022)
PDF:
https://aclanthology.org/2022.icon-wlli.5.pdf