NLP_Team1@SSN at SemEval-2024 Task 1: Impact of language models in Sentence-BERT for Semantic Textual Relatedness in Low-resource Languages

Senthil Kumar, Aravindan Chandrabose, Gokulakrishnan B, Karthikraja Tp


Abstract
Semantic Textual Relatedness (STR) provides insight into the limitations of existing models and supports ongoing work on semantic representations. Track A in Shared Task-1 provides pairs of sentences with semantic relatedness scores for 9 languages, of which 7 are low-resource. These languages come from four different language families. We developed models for 8 of the languages in Track A (all except Amharic) using the Sentence Transformers (SBERT) architecture, fine-tuned with multilingual and monolingual pre-trained language models (PLMs). Our models for English (eng), Algerian Arabic (arq), and Kinyarwanda (kin) were ranked 12th, 5th, and 8th, respectively. Our submissions ranked 5th among 40 submissions in Track A, with an average Spearman correlation score of 0.74. However, we observed that using monolingual PLMs did not guarantee better performance than multilingual PLMs for Marathi (mar) and Telugu (tel) in our case.
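Since the abstract reports results as a Spearman correlation score, a minimal sketch of how such a score might be computed between gold relatedness scores and model predictions is given here. The function names and the example scores are hypothetical and are not taken from the paper or the task's official scorer; the sketch only illustrates the standard definition (Pearson correlation over ranks, with tied values sharing an average rank).

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(gold, predicted):
    """Spearman correlation: Pearson correlation of the two rank lists."""
    if len(gold) != len(predicted) or len(gold) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rg, rp = average_ranks(gold), average_ranks(predicted)
    mg, mp = mean(rg), mean(rp)
    cov = sum((a - mg) * (b - mp) for a, b in zip(rg, rp))
    var_g = sum((a - mg) ** 2 for a in rg)
    var_p = sum((b - mp) ** 2 for b in rp)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (var_g * var_p) ** 0.5


# Hypothetical scores for five sentence pairs (illustration only).
gold = [0.10, 0.35, 0.60, 0.80, 0.95]
predicted = [0.20, 0.30, 0.75, 0.70, 0.90]
print(round(spearman(gold, predicted), 2))
```

Because only the ordering of the scores matters, a model can reach a high Spearman score even when its predicted values are on a different scale from the gold annotations.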
Anthology ID:
2024.semeval-1.260
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1854–1859
URL:
https://aclanthology.org/2024.semeval-1.260
Cite (ACL):
Senthil Kumar, Aravindan Chandrabose, Gokulakrishnan B, and Karthikraja Tp. 2024. NLP_Team1@SSN at SemEval-2024 Task 1: Impact of language models in Sentence-BERT for Semantic Textual Relatedness in Low-resource Languages. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1854–1859, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
NLP_Team1@SSN at SemEval-2024 Task 1: Impact of language models in Sentence-BERT for Semantic Textual Relatedness in Low-resource Languages (Kumar et al., SemEval 2024)
PDF:
https://aclanthology.org/2024.semeval-1.260.pdf
Supplementary material:
 2024.semeval-1.260.SupplementaryMaterial.txt