JU_NLP at HinglishEval: Quality Evaluation of the Low-Resource Code-Mixed Hinglish Text

Prantik Guha, Rudra Dhar, Dipankar Das


Abstract
In this paper, we describe a system submitted to the INLG 2022 Generation Challenge (GenChal) on Quality Evaluation of the Low-Resource Synthetically Generated Code-Mixed Hinglish Text. We implement a Bi-LSTM-based neural network model to predict the average rating score and the disagreement score of the synthetic Hinglish dataset. In our models, we use word embeddings for the English and Hindi data and one-hot encodings for the Hinglish data. We achieve an F1 score of 0.11 and a mean squared error of 6.0 on the average rating score prediction task. On the disagreement score prediction task, we achieve an F1 score of 0.18 and a mean squared error of 5.0.
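The abstract reports two metrics per task, an F1 score and a mean squared error. As a minimal sketch of how such figures could be computed, the following assumes that scores are integer ratings treated as class labels for F1 and that F1 is macro-averaged over the labels; neither assumption is stated in the abstract, and the function names `mean_squared_error` and `macro_f1` are hypothetical, not taken from the authors' code.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between gold and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    per_label = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A label with no true positives, false positives, or false
        # negatives cannot occur here, but guard the division anyway.
        per_label.append(2 * tp / denom if denom else 0.0)
    return sum(per_label) / len(per_label)


if __name__ == "__main__":
    # Toy ratings for illustration only; not data from the paper.
    gold = [1, 2, 3, 3]
    pred = [1, 3, 3, 2]
    print(mean_squared_error(gold, pred))
    print(macro_f1(gold, pred))
```

Under a different averaging scheme (micro or weighted), the F1 value would differ, so the scheme should be checked against the shared task description before comparing numbers.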
Anthology ID:
2022.inlg-genchal.7
Volume:
Proceedings of the 15th International Conference on Natural Language Generation: Generation Challenges
Month:
July
Year:
2022
Address:
Waterville, Maine, USA and virtual meeting
Editors:
Samira Shaikh, Thiago Ferreira, Amanda Stent
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
39–42
URL:
https://aclanthology.org/2022.inlg-genchal.7
Cite (ACL):
Prantik Guha, Rudra Dhar, and Dipankar Das. 2022. JU_NLP at HinglishEval: Quality Evaluation of the Low-Resource Code-Mixed Hinglish Text. In Proceedings of the 15th International Conference on Natural Language Generation: Generation Challenges, pages 39–42, Waterville, Maine, USA and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
JU_NLP at HinglishEval: Quality Evaluation of the Low-Resource Code-Mixed Hinglish Text (Guha et al., INLG 2022)
PDF:
https://aclanthology.org/2022.inlg-genchal.7.pdf