Navid Ebrahimi
2025
DadmaTools V2: an Adapter-Based Natural Language Processing Toolkit for the Persian Language
Sadegh Jafari
|
Farhan Farsi
|
Navid Ebrahimi
|
Mohamad Bagher Sajadi
|
Sauleh Eetemadi
Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script
DadmaTools V2 is a comprehensive repository designed to enhance NLP capabilities for the Persian language, catering to industry practitioners seeking practical and efficient solutions. The toolkit provides extensive code examples demonstrating the integration of its models with popular NLP frameworks such as Trankit and Transformers, as well as deep learning frameworks like PyTorch. Additionally, DadmaTools supports widely used Persian embeddings and datasets, ensuring robust language processing capabilities. The latest version of DadmaTools introduces an adapter-based technique, significantly reducing memory usage by employing a shared pre-trained model across various tasks, supplemented with task-specific adapter layers. This approach eliminates the need to maintain multiple pre-trained models and optimize resource utilization. Enhancements in this version include adding new modules such as a sentiment detector, an informal-to-formal text converter, and a spell checker, further expanding the toolkit’s functionality. DadmaTools V2 thus represents a powerful, efficient, and versatile resource for advancing Persian NLP applications.