Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure

Yang Hou; Zhenghua Li (李正华)

doi:10.18653/v1/2024.findings-acl.173

Abstract

Revealing the syntactic structure of sentences in Chinese poses significant challenges for word-level parsers due to the absence of clear word boundaries. To facilitate a transition from word-level to character-level Chinese dependency parsing, this paper proposes modeling latent internal structures within words. In this way, each word-level dependency tree is interpreted as a forest of character-level trees. A constrained Eisner algorithm is implemented to ensure the compatibility of character-level trees, guaranteeing a single root for intra-word structures and establishing inter-word dependencies between these roots. Experiments on Chinese treebanks demonstrate the superiority of our method over both the pipeline framework and previous joint models. A detailed analysis reveals that a coarse-to-fine parsing strategy empowers the model to predict more linguistically plausible intra-word structures.
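The compatibility constraint described in the abstract (one root per word, with inter-word dependencies linking only those roots) can be illustrated with a small checker. This is a minimal sketch under stated assumptions and is not the paper's constrained Eisner algorithm: the function name `is_compatible`, the head-array encoding, and the word-span format are all hypothetical choices made for illustration. It also assumes the input is already a well-formed tree and that the word spans cover every character.

```python
from typing import Dict, List, Tuple


def is_compatible(heads: List[int], words: List[Tuple[int, int]]) -> bool:
    """Check a character-level tree against a word segmentation.

    heads[i] is the head of character i + 1 (characters are 1-indexed);
    a head of 0 denotes the artificial sentence root.
    words holds inclusive 1-indexed (start, end) character spans.
    Assumption: heads already encodes a valid tree (no cycle check here).
    """
    # Map every character position to the index of the word containing it.
    word_of: Dict[int, int] = {}
    for w, (start, end) in enumerate(words):
        for c in range(start, end + 1):
            word_of[c] = w

    # Each word must have exactly one character whose head lies outside it.
    roots: Dict[int, int] = {}
    for w, (start, end) in enumerate(words):
        outside = [c for c in range(start, end + 1)
                   if not (start <= heads[c - 1] <= end)]
        if len(outside) != 1:
            return False
        roots[w] = outside[0]

    # An inter-word arc must point at the root character of the head word
    # (or at the artificial root 0).
    for root_char in roots.values():
        head = heads[root_char - 1]
        if head != 0 and roots[word_of[head]] != head:
            return False
    return True


# Two words covering characters 1-2 and 3-4 (a made-up example).
segmentation = [(1, 2), (3, 4)]
ok = is_compatible([2, 0, 2, 3], segmentation)           # roots 2 and 3, arc 3 -> 2
bad_attach = is_compatible([2, 0, 1, 3], segmentation)   # 3 attaches to a non-root
two_roots = is_compatible([3, 0, 2, 3], segmentation)    # first word has two roots
```

Here `ok` is true, while `bad_attach` and `two_roots` are false, since they violate the inter-word and single-root conditions respectively.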

Anthology ID: 2024.findings-acl.173
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2943–2956
URL: https://aclanthology.org/2024.findings-acl.173/
DOI: 10.18653/v1/2024.findings-acl.173
Cite (ACL): Yang Hou and Zhenghua Li. 2024. Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure. In Findings of the Association for Computational Linguistics: ACL 2024, pages 2943–2956, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure (Hou & Li, Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.173.pdf
