Dursun Dashdamirov
2024
Findings of the 2nd Shared Task on Multi-lingual Multi-task Information Retrieval at MRL 2024
Francesco Tinner | Raghav Mantri | Mammad Hajili | Chiamaka Chukwuneke | Dylan Massey | Benjamin A. Ajibade | Bilge Deniz Kocak | Abolade Dawud | Jonathan Atala | Hale Sirin | Kayode Olaleye | Anar Rzayev | Jafar Isbarov | Dursun Dashdamirov | David Adelani | Duygu Ataman
Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
Large language models (LLMs) demonstrate exceptional proficiency in both the comprehension and generation of textual data, particularly in English, a language for which extensive public benchmarks have been established across a wide range of natural language processing (NLP) tasks. Nonetheless, their performance in multilingual contexts and specialized domains remains less rigorously validated, raising questions about their reliability and generalizability across linguistically diverse and domain-specific settings. The second edition of the Shared Task on Multilingual Multitask Information Retrieval aims to provide a comprehensive and inclusive multilingual evaluation benchmark that helps assess the ability of multilingual LLMs to capture logical, factual, or causal relationships within lengthy text contexts and to generate language under sparse settings, particularly in scenarios with under-resourced languages. The shared task consists of two subtasks crucial to information retrieval, named entity recognition (NER) and reading comprehension (RC), in 7 data-scarce languages, including Azerbaijani, Swiss German, and Turkish, which previously lacked annotated resources in information retrieval tasks. This year's edition specifically focuses on the multiple-choice question answering evaluation setting, which provides a more objective setting for comparing different methods across languages.