Tracing Traditions: Automatic Extraction of Isnads from Classical Arabic Texts

Ryan Muther, David Smith


Abstract
We present our work on automatically detecting isnads, the chains of authorities for a report that serve as citations in hadith and other classical Arabic texts. We experiment with both sequence labeling methods for identifying isnads in a single pass and a hybrid “retrieve-and-tag” approach, in which a retrieval model first identifies portions of the text that are likely to contain start points for isnads, then a sequence labeling model identifies the exact starting locations within these much smaller retrieved text chunks. We find that the usefulness of full-document sequence-to-sequence models is limited by memory constraints and by the ineffectiveness of such models at modeling very long documents. We conclude by sketching future improvements on the tagging task and more in-depth analysis of the people and relationships involved in the social network that influenced the evolution of the written tradition over time.
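The two-stage "retrieve-and-tag" flow described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the fixed-size chunking, and especially the cue-word heuristics, which stand in for the learned retrieval and sequence labeling models. The transliterated cue words are common transmission terms, but in real texts they also occur inside a chain, so this tagger is far cruder than a trained one.

```python
from typing import Callable, List, Tuple

# Hypothetical, transliterated cue words; a stand-in for learned models.
CUES = {"haddathana", "akhbarana"}

Chunk = Tuple[int, List[str]]  # (offset of chunk in document, tokens)


def chunk(tokens: List[str], size: int) -> List[Chunk]:
    """Split a long token sequence into fixed-size chunks with offsets."""
    return [(i, tokens[i:i + size]) for i in range(0, len(tokens), size)]


def cue_score(chunk_tokens: List[str]) -> int:
    """Toy retrieval score: number of cue words in the chunk."""
    return sum(t in CUES for t in chunk_tokens)


def retrieve(chunks: List[Chunk],
             score: Callable[[List[str]], int],
             k: int) -> List[Chunk]:
    """Stage 1: keep the k highest-scoring chunks with a positive score."""
    ranked = sorted(chunks, key=lambda c: score(c[1]), reverse=True)
    kept = [c for c in ranked[:k] if score(c[1]) > 0]
    return sorted(kept, key=lambda c: c[0])


def tag_starts(chunk_tokens: List[str]) -> List[int]:
    """Stage 2 (toy tagger): mark cue-word positions as candidate starts."""
    return [i for i, t in enumerate(chunk_tokens) if t in CUES]


def retrieve_and_tag(tokens: List[str], size: int = 4, k: int = 2) -> List[int]:
    """Run retrieval over chunks, then tag only inside retrieved chunks."""
    starts: List[int] = []
    for offset, toks in retrieve(chunk(tokens, size), cue_score, k):
        starts.extend(offset + i for i in tag_starts(toks))
    return sorted(starts)


doc = ("w1 w2 haddathana fulan an fulan qala w3 "
       "w4 w5 w6 w7 akhbarana fulan w8 w9").split()
print(retrieve_and_tag(doc))
```

The point of the split is that the second stage only ever sees short retrieved chunks, so it avoids the memory and long-context problems the abstract attributes to tagging a full document in one pass.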
Anthology ID:
2020.wanlp-1.12
Volume:
Proceedings of the Fifth Arabic Natural Language Processing Workshop
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Imed Zitouni, Muhammad Abdul-Mageed, Houda Bouamor, Fethi Bougares, Mahmoud El-Haj, Nadi Tomeh, Wajdi Zaghouani
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
130–138
URL:
https://aclanthology.org/2020.wanlp-1.12
Cite (ACL):
Ryan Muther and David Smith. 2020. Tracing Traditions: Automatic Extraction of Isnads from Classical Arabic Texts. In Proceedings of the Fifth Arabic Natural Language Processing Workshop, pages 130–138, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Tracing Traditions: Automatic Extraction of Isnads from Classical Arabic Texts (Muther & Smith, WANLP 2020)
PDF:
https://aclanthology.org/2020.wanlp-1.12.pdf