Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese

Zhang Jingshen, Chen Xinglu, Qiu Xinying, Wang Zhimin, Feng Wenhe


Abstract
Chinese sentence simplification faces challenges due to the lack of large-scale labeled parallel corpora and the prevalence of idioms. To address these challenges, we propose Readability-guided Idiom-aware Sentence Simplification (RISS), a novel framework that combines data augmentation techniques. RISS introduces two key components: (1) Readability-guided Paraphrase Selection (RPS), a method for mining high-quality sentence pairs, and (2) Idiom-aware Simplification (IAS), a model that enhances the comprehension and simplification of idiomatic expressions. By integrating RPS and IAS using multi-stage and multi-task learning strategies, RISS outperforms previous state-of-the-art methods on two Chinese sentence simplification datasets. Furthermore, RISS achieves additional improvements when fine-tuned on a small labeled dataset. Our approach demonstrates the potential for more effective and accessible Chinese text simplification.
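The abstract does not spell out how RPS scores or filters candidates, so the following is only a minimal illustrative sketch of the general idea of readability-guided pair mining: given paraphrase pairs, keep those where one side is clearly easier to read and orient each kept pair as (complex, simple). The names `toy_difficulty` and `select_pairs`, the character-based difficulty proxy, and the `min_gain` threshold are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, Iterable, List, Set, Tuple


def toy_difficulty(sentence: str, common_chars: Set[str]) -> float:
    """Hypothetical difficulty proxy (an assumption, not the paper's metric):
    character count plus a fixed penalty for each character outside a
    small set of common characters."""
    chars = [c for c in sentence if not c.isspace()]
    if not chars:
        return 0.0
    rare = sum(1 for c in chars if c not in common_chars)
    return len(chars) + 5.0 * rare


def select_pairs(
    pairs: Iterable[Tuple[str, str]],
    difficulty: Callable[[str], float],
    min_gain: float = 1.0,
) -> List[Tuple[str, str]]:
    """Keep paraphrase pairs whose difficulty gap is at least `min_gain`,
    returning each one ordered as (complex sentence, simple sentence)."""
    selected = []
    for a, b in pairs:
        da, db = difficulty(a), difficulty(b)
        if abs(da - db) < min_gain:
            continue  # the two sides are about equally readable; skip
        selected.append((a, b) if da > db else (b, a))
    return selected


if __name__ == "__main__":
    # Tiny hand-picked "common character" set, for demonstration only.
    common = set("我他很高兴的是了不")
    candidates = [("他很高兴", "他欣喜若狂"), ("我很高兴", "我高兴了")]
    for complex_s, simple_s in select_pairs(
        candidates, lambda s: toy_difficulty(s, common)
    ):
        print(complex_s, "->", simple_s)
```

In this toy setup the idiomatic sentence 他欣喜若狂 scores as harder than 他很高兴, so the pair is kept with the idiom on the complex side, while the second pair is dropped because both sentences score the same. A real system would presumably replace the proxy with a trained readability model.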
Anthology ID:
2024.ccl-1.92
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1183–1200
Language:
English
URL:
https://aclanthology.org/2024.ccl-1.92/
Cite (ACL):
Zhang Jingshen, Chen Xinglu, Qiu Xinying, Wang Zhimin, and Feng Wenhe. 2024. Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 1183–1200, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese (Jingshen et al., CCL 2024)
PDF:
https://aclanthology.org/2024.ccl-1.92.pdf