Title: TafsirExtractor: Text Preprocessing Pipeline preparing Classical Arabic Literature for Machine Learning Applications
Authors: Carl Kruse; Sajawel Ahmed
Date: 2024-05
Resource type: text
Genre: conference publication
Published in: Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024
Editors: Hend Al-Khalifa; Kareem Darwish; Hamdy Mubarak; Mona Ali; Tamer Elsayed
Publisher: ELRA and ICCL
Place: Torino, Italia
Pages: 67-73
Identifier: kruse-ahmed-2024-tafsirextractor
URL: https://aclanthology.org/2024.osact-1.8/