CalligraphicOCR for Chinese Calligraphy Recognition

Xiaoyi Bao; Zhongqing Wang; Jinghang Gu; Chu-Ren Huang

doi:10.18653/v1/2025.emnlp-main.245

Abstract

With thousands of years of history, calligraphy serves as one of the representative symbols of Chinese culture. A growing body of work tries to digitize calligraphy by recognizing the text in calligraphy works for better preservation and propagation. However, previous work is limited to isolated single-character recognition, which not only requires impractical manual splitting into characters but also discards the rich contextual information that could serve as a supplement. To this end, we construct a pioneering end-to-end calligraphy recognition benchmark dataset. This dataset is challenging because of both visual variation, such as different writing styles, and textual understanding, such as the domain shift in semantics. We further propose CalligraphicOCR (COCR), equipped with calligraphic image augmentation and an action-based corrector that target the root challenges of this setting. Experiments demonstrate the advantage of our proposed model over cutting-edge baselines, underscoring the necessity of introducing this new setting and thereby laying a solid foundation for protecting and propagating these already scarce resources.

Anthology ID: 2025.emnlp-main.245
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 4865–4877
URL: https://aclanthology.org/2025.emnlp-main.245/
DOI: 10.18653/v1/2025.emnlp-main.245
Cite (ACL): Xiaoyi Bao, Zhongqing Wang, Jinghang Gu, and Chu-Ren Huang. 2025. CalligraphicOCR for Chinese Calligraphy Recognition. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 4865–4877, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): CalligraphicOCR for Chinese Calligraphy Recognition (Bao et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.245.pdf
Checklist: 2025.emnlp-main.245.checklist.pdf
