ELYADATA at NADI 2024 shared task: Arabic Dialect Identification with Similarity-Induced Mono-to-Multi Label Transformation.
Amira Karoui, Farah Gharbi, Rami Kammoun, Imen Laouirine, Fethi Bougares
Abstract
This paper describes our submissions to the Multi-label Country-level Dialect Identification subtask of the NADI2024 shared task, organized during the second edition of the ArabicNLP conference. Our submission is based on the ensemble of fine-tuned BERT-based models, after implementing the Similarity-Induced Mono-to-Multi Label Transformation (SIMMT) on the input data. Our submission ranked first with a Macro-Average (MA) F1 score of 50.57%.
- Anthology ID:
- 2024.arabicnlp-1.85
- Volume:
- Proceedings of The Second Arabic Natural Language Processing Conference
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Nizar Habash, Houda Bouamor, Ramy Eskander, Nadi Tomeh, Ibrahim Abu Farha, Ahmed Abdelali, Samia Touileb, Injy Hamed, Yaser Onaizan, Bashar Alhafni, Wissam Antoun, Salam Khalifa, Hatem Haddad, Imed Zitouni, Badr AlKhamissi, Rawan Almatham, Khalil Mrini
- Venues:
- ArabicNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 758–763
- URL:
- https://aclanthology.org/2024.arabicnlp-1.85
- DOI:
- 10.18653/v1/2024.arabicnlp-1.85
- Bibkey:
- karoui-etal-2024-elyadata
- Cite (ACL):
- Amira Karoui, Farah Gharbi, Rami Kammoun, Imen Laouirine, and Fethi Bougares. 2024. ELYADATA at NADI 2024 shared task: Arabic Dialect Identification with Similarity-Induced Mono-to-Multi Label Transformation. In Proceedings of The Second Arabic Natural Language Processing Conference, pages 758–763, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- ELYADATA at NADI 2024 shared task: Arabic Dialect Identification with Similarity-Induced Mono-to-Multi Label Transformation. (Karoui et al., ArabicNLP-WS 2024)
- PDF:
- https://aclanthology.org/2024.arabicnlp-1.85.pdf
Export citation
@inproceedings{karoui-etal-2024-elyadata,
    title = "{ELYADATA} at {NADI} 2024 shared task: {A}rabic Dialect Identification with Similarity-Induced Mono-to-Multi Label Transformation.",
    author = "Karoui, Amira and Gharbi, Farah and Kammoun, Rami and Laouirine, Imen and Bougares, Fethi",
    editor = "Habash, Nizar and Bouamor, Houda and Eskander, Ramy and Tomeh, Nadi and Abu Farha, Ibrahim and Abdelali, Ahmed and Touileb, Samia and Hamed, Injy and Onaizan, Yaser and Alhafni, Bashar and Antoun, Wissam and Khalifa, Salam and Haddad, Hatem and Zitouni, Imed and AlKhamissi, Badr and Almatham, Rawan and Mrini, Khalil",
    booktitle = "Proceedings of The Second Arabic Natural Language Processing Conference",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.arabicnlp-1.85",
    doi = "10.18653/v1/2024.arabicnlp-1.85",
    pages = "758--763",
    abstract = "This paper describes our submissions to the Multi-label Country-level Dialect Identification subtask of the NADI2024 shared task, organized during the second edition of the ArabicNLP conference. Our submission is based on the ensemble of fine-tuned BERT-based models, after implementing the Similarity-Induced Mono-to-Multi Label Transformation (SIMMT) on the input data. Our submission ranked first with a Macro-Average (MA) F1 score of 50.57{\%}.",
}