Yixia Wang


2024

Simplified Chinese Character Distance Based on Ideographic Description Sequences
Yixia Wang | Emmanuel Keuleers
Proceedings of the Second Workshop on Computation and Written Language (CAWL) @ LREC-COLING 2024

Character encoding systems have long overlooked the internal structure of characters. Ideographic Description Sequences, which explicitly represent spatial relations between character components, are a potential solution to this problem. In this paper, we illustrate the utility of Ideographic Description Sequences in computing edit distance and finding orthographic neighbors for Simplified Chinese characters. In addition, we explore the possibility of using Ideographic Description Sequences to encode spatial relations between components in other scripts.
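As a rough illustration of the idea in the abstract, the sketch below computes a plain Levenshtein distance over Ideographic Description Sequences and uses it to list orthographic neighbors. It is a minimal sketch under stated assumptions, not the authors' method: the small `IDS` table is hand-entered for illustration, the helper names `edit_distance` and `neighbors` are hypothetical, every IDS symbol (operator or component) is treated as one unit with equal cost, and the paper may weight operations or decompose components differently.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of IDS symbols."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, 1):
        curr = [i]
        for j, sym_b in enumerate(b, 1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hand-entered, illustrative decompositions (an assumption, not data from the paper).
# U+2FF0 (⿰) means left-to-right; U+2FF1 (⿱) means above-to-below.
IDS = {
    "妈": "⿰女马",
    "吗": "⿰口马",
    "码": "⿰石马",
    "好": "⿰女子",
    "明": "⿰日月",
    "吕": "⿱口口",
}


def neighbors(char, table, max_dist=1):
    """Characters whose IDS lies within max_dist edits of char's IDS."""
    target = table[char]
    return sorted(c for c, seq in table.items()
                  if c != char and edit_distance(target, seq) <= max_dist)


if __name__ == "__main__":
    print(neighbors("吗", IDS))
```

With this toy table, 吗 and 妈 differ only in their left component, so they are one substitution apart, whereas 吗 and 吕 differ in both the layout operator and one component.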