A Mutual Information-based Approach to Quantifying Logography in Japanese and Sumerian

Noah Hermalin


Abstract
Writing systems have traditionally been classified by whether they prioritize encoding phonological information (phonographic) or morphological or semantic information (logographic). Recent work has broached the question of how membership in these categories can be quantified. We aim to contribute to this line of research by adopting a definition of logography that directly incorporates morphological identity. Our methods compare the mutual information between graphic forms and phonological forms with the mutual information between graphic forms and morphological identity. We report preliminary results for two case studies, written Sumerian and written Japanese. The results suggest that our methods offer a promising means of classifying the degree to which a writing system is logographic or phonographic.
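The comparison described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the plug-in estimator, the function name `mutual_information`, and the toy tokens (graph labels `G1`/`G2`, readings `ka`/`ki`, morpheme labels `M1`/`M2`) are all assumptions made for illustration. The sketch estimates mutual information in bits from paired observations, once for graph and phonological form and once for graph and morpheme identity.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from (x, y) observations."""
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy corpus: (graph, phonological form, morpheme identity).
# Each graph always has the same reading but is shared across morphemes,
# so it behaves like a purely phonographic sign.
tokens = [
    ("G1", "ka", "M1"),
    ("G1", "ka", "M2"),
    ("G2", "ki", "M1"),
    ("G2", "ki", "M2"),
]

mi_phon = mutual_information((g, p) for g, p, _ in tokens)
mi_morph = mutual_information((g, m) for g, _, m in tokens)
print(f"I(graph; phonology)  = {mi_phon:.3f} bits")
print(f"I(graph; morphology) = {mi_morph:.3f} bits")
```

In this toy data the graphs are fully informative about phonological form and carry no information about morpheme identity. A logographic sign inventory would be expected to shift that balance toward morphology. How the two quantities are normalized and combined into a single measure is left to the paper itself.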
Anthology ID:
2023.cawl-1.12
Volume:
Proceedings of the Workshop on Computation and Written Language (CAWL 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Kyle Gorman, Richard Sproat, Brian Roark
Venue:
CAWL
Publisher:
Association for Computational Linguistics
Pages:
105–110
URL:
https://aclanthology.org/2023.cawl-1.12
DOI:
10.18653/v1/2023.cawl-1.12
Cite (ACL):
Noah Hermalin. 2023. A Mutual Information-based Approach to Quantifying Logography in Japanese and Sumerian. In Proceedings of the Workshop on Computation and Written Language (CAWL 2023), pages 105–110, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Mutual Information-based Approach to Quantifying Logography in Japanese and Sumerian (Hermalin, CAWL 2023)
PDF:
https://aclanthology.org/2023.cawl-1.12.pdf