MOZ-Smishing: A Benchmark Dataset for Detecting Mobile Money Frauds

Felermino D. M. A. Ali; Saide M. Saide; Rui Sousa-Silva; Henrique Lopes Cardoso

doi:10.18653/v1/2025.africanlp-1.23

Abstract

Despite the increasing prevalence of smishing attacks targeting Mobile Money Transfer systems, there is a notable lack of publicly available SMS phishing datasets in this domain. This study addresses this gap by creating a specialized dataset designed to detect smishing attacks aimed at Mobile Money Transfer users. The dataset consists of crowd-sourced text messages from Mozambican mobile users, meticulously annotated into two categories: legitimate messages (ham) and fraudulent smishing attempts (spam). The messages are written in Portuguese, often incorporating microtext styles and linguistic nuances unique to the Mozambican context. We also investigate the effectiveness of large language models (LLMs) in detecting smishing. Using in-context learning approaches, we evaluate the models’ ability to identify smishing attempts without requiring extensive task-specific training. The dataset is released under an open license at the following link: huggingface-Anonymous

Anthology ID: 2025.africanlp-1.23
Volume: Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Constantine Lignos, Idris Abdulmumin, David Adelani
Venues: AfricaNLP | WS
Publisher: Association for Computational Linguistics
Pages: 158–166
URL: https://aclanthology.org/2025.africanlp-1.23/
DOI: 10.18653/v1/2025.africanlp-1.23
Cite (ACL): Felermino D. M. A. Ali, Saide M. Saide, Rui Sousa-Silva, and Henrique Lopes Cardoso. 2025. MOZ-Smishing: A Benchmark Dataset for Detecting Mobile Money Frauds. In Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025), pages 158–166, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): MOZ-Smishing: A Benchmark Dataset for Detecting Mobile Money Frauds (Ali et al., AfricaNLP 2025)
PDF: https://aclanthology.org/2025.africanlp-1.23.pdf
