Seyed Ali Farokh
2024
ALF at SemEval-2024 Task 9: Exploring Lateral Thinking Capabilities of LMs through Multi-task Fine-tuning
Seyed Ali Farokh
|
Hossein Zeinali
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Recent advancements in natural language processing (NLP) have prompted the development of sophisticated reasoning benchmarks. This paper presents our system for the SemEval 2024 Task 9 competition and also investigates the efficacy of fine-tuning language models (LMs) on BrainTeaser—a benchmark designed to evaluate NLP models’ lateral thinking and creative reasoning abilities. Our experiments focus on two prominent families of pre-trained models, BERT and T5. Additionally, we explore the potential benefits of multi-task fine-tuning on commonsense reasoning datasets to enhance performance. Our top-performing model, DeBERTa-v3-large, achieves an impressive overall accuracy of 93.33%, surpassing human performance.
Search