UFCNet: Unsupervised Network based on Fourier transform and Convolutional attention for Oracle Character Recognition

Yanan Zhou, Guoqi Liu, Yiping Yang, Linyuan Ru, Dong Liu, Xueshan Li


Abstract
Oracle bone script (OBS) is the earliest writing system in China and is of great value to archaeology and the study of Chinese cultural history. However, labels are scarce and the glyphs are difficult to distinguish from the background, so automatic recognition of OBS in real-world settings does not yet achieve satisfactory results. In this paper, we propose a character recognition method based on an unsupervised domain adaptive network (UFCNet). First, a convolutional attention fusion module (CAFM) is designed in the encoder to obtain more global features through multi-layer feature fusion. Second, we construct a Fourier transform (FT) module that focuses on the differences between glyphs and backgrounds. Finally, to further improve the network’s ability to recognize character edges, we introduce a kernel norm-constrained loss function. Extensive experiments performed on the Oracle-241 dataset show that the proposed method outperforms other adaptive methods. The code will be available at https://github.com/zhouynan/UFCNet.
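The abstract does not describe the internals of the FT module, so the following is only a generic illustration of why a frequency-domain view can help separate glyphs from background, not the authors' implementation. It is a minimal sketch in pure Python, assuming a tiny grayscale image stored as a list of lists; the function names `dft2`, `idft2`, and `suppress_background` are hypothetical. Zeroing the zero-frequency (DC) coefficient removes the constant background level, leaving the stroke as the only positive response.

```python
import cmath


def dft2(img):
    """Naive 2D discrete Fourier transform of a list-of-lists image."""
    h, w = len(img), len(img[0])
    return [[sum(img[y][x] * cmath.exp(-2j * cmath.pi * (u * y / h + v * x / w))
                 for y in range(h) for x in range(w))
             for v in range(w)]
            for u in range(h)]


def idft2(spec):
    """Inverse of dft2; returns the real part of the reconstructed image."""
    h, w = len(spec), len(spec[0])
    return [[(sum(spec[u][v] * cmath.exp(2j * cmath.pi * (u * y / h + v * x / w))
                  for u in range(h) for v in range(w)) / (h * w)).real
             for x in range(w)]
            for y in range(h)]


def suppress_background(img):
    """Illustrative assumption: drop the DC term so a flat background is removed."""
    spec = dft2(img)
    spec[0][0] = 0  # the DC coefficient carries the mean intensity of the image
    return idft2(spec)


# Toy 4x4 "glyph": a vertical stroke (1.0) on a uniform background (0.5).
image = [[0.5, 1.0, 0.5, 0.5] for _ in range(4)]
filtered = suppress_background(image)
```

On this toy input the mean intensity is 0.625, so after filtering the background pixels sit at -0.125 and the stroke pixels at 0.375. A real module would operate on learned feature maps and use a fast FFT routine rather than this quadratic-time transform.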
Anthology ID:
2024.ml4al-1.11
Volume:
Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Month:
August
Year:
2024
Address:
Hybrid in Bangkok, Thailand and online
Editors:
John Pavlopoulos, Thea Sommerschield, Yannis Assael, Shai Gordin, Kyunghyun Cho, Marco Passarotti, Rachele Sprugnoli, Yudong Liu, Bin Li, Adam Anderson
Venues:
ML4AL | WS
Publisher:
Association for Computational Linguistics
Pages:
98–106
URL:
https://aclanthology.org/2024.ml4al-1.11
Cite (ACL):
Yanan Zhou, Guoqi Liu, Yiping Yang, Linyuan Ru, Dong Liu, and Xueshan Li. 2024. UFCNet: Unsupervised Network based on Fourier transform and Convolutional attention for Oracle Character Recognition. In Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024), pages 98–106, Hybrid in Bangkok, Thailand and online. Association for Computational Linguistics.
Cite (Informal):
UFCNet: Unsupervised Network based on Fourier transform and Convolutional attention for Oracle Character Recognition (Zhou et al., ML4AL-WS 2024)
PDF:
https://aclanthology.org/2024.ml4al-1.11.pdf