Shubh Nisar
2024
Commentator: A Code-mixed Multilingual Text Annotation Framework
Rajvee Sheth | Shubh Nisar | Heenaben Prajapati | Himanshu Beniwal | Mayank Singh
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
As the NLP community increasingly addresses challenges associated with multilingualism, robust annotation tools are essential to handle multilingual datasets efficiently. In this paper, we introduce a code-mixed multilingual text annotation framework, COMMENTATOR, specifically designed for annotating code-mixed text. The tool demonstrates its effectiveness in token-level and sentence-level language annotation tasks for Hinglish text. We perform robust qualitative human-based evaluations to show that COMMENTATOR leads to 5x faster annotations than the best baseline.