Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors

Fuwen Luo; Zihao Wan; Ziyue Wang; Yaluo Liu; Pau Tong Lin Xu; Xuanjia Qiao; Xiaolong Wang; Peng Li; Yang Liu

Abstract

Hieroglyphs, as logographic writing systems, encode rich semantic and cultural information within their internal structural composition. Yet current advanced Large Language Models (LLMs) and Multimodal LLMs (MLLMs) usually remain structurally blind to this information. LLMs process characters as textual tokens, while MLLMs additionally view them as raw pixel grids. Both fall short of modeling the underlying logic of character strokes. Furthermore, existing structural analysis methods are often script-specific and labor-intensive. In this paper, we propose the Hieroglyphic Stroke Analyzer (HieroSA), a novel and generalizable framework that enables MLLMs to automatically derive stroke-level structures from character bitmaps without handcrafted data. It transforms character images from modern logographic and ancient hieroglyphic scripts into explicit, interpretable line-segment representations in a normalized coordinate space, allowing for cross-lingual generalization. Extensive experiments demonstrate that HieroSA effectively captures character-internal structures and semantics, bypassing the need for language-specific priors. Experimental results highlight the potential of our work as a graphematics analysis tool for a deeper understanding of hieroglyphic scripts.
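
To make the phrase "line-segment representations in a normalized coordinate space" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a stroke is approximated by straight segments given in pixel coordinates and that normalization means dividing by the bitmap width and height; the names `Segment` and `normalize_segments` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A straight line segment from (x1, y1) to (x2, y2). Hypothetical type."""
    x1: float
    y1: float
    x2: float
    y2: float


def normalize_segments(segments_px, width, height):
    """Map pixel-space segments into the unit square [0, 1] x [0, 1].

    Assumption: normalization is a plain division by the bitmap size, so the
    same character drawn at different resolutions yields the same segments.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [
        Segment(s.x1 / width, s.y1 / height, s.x2 / width, s.y2 / height)
        for s in segments_px
    ]


# A horizontal stroke across the middle of a 64x64 bitmap.
stroke = [Segment(0, 32, 64, 32)]
normalized = normalize_segments(stroke, width=64, height=64)
```

Under these assumptions, a representation of this kind is independent of image resolution, which is one plausible reason a normalized space could help when comparing characters across scripts.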

Anthology ID: 2026.findings-acl.1383
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 27794–27810
URL: https://aclanthology.org/2026.findings-acl.1383/
Cite (ACL): Fuwen Luo, Zihao Wan, Ziyue Wang, Yaluo Liu, Pau Tong Lin Xu, Xuanjia Qiao, Xiaolong Wang, Peng Li, and Yang Liu. 2026. Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors. In Findings of the Association for Computational Linguistics: ACL 2026, pages 27794–27810, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors (Luo et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1383.pdf
Checklist: 2026.findings-acl.1383.checklist.pdf
