Nuowei Liu
2025
DVAGen: Dynamic Vocabulary Augmented Generation
Wei Du | Nuowei Liu | Jie Wang | Jiahao Kuang | Tao Ji | Xiaoling Wang | Yuanbin Wu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Language models trained with a fixed vocabulary struggle to generalize to novel or out-of-vocabulary words, limiting their flexibility in handling diverse token combinations. Existing dynamic vocabulary approaches attempt to address this limitation but face challenges such as fragmented codebases, lack of support for modern LLMs, and limited inference scalability. To overcome these issues, we introduce DVAGen, a fully open-source, unified framework designed for training, evaluation, and visualization of dynamic vocabulary-augmented language models. Our framework modularizes the pipeline for ease of customization, integrates seamlessly with open-source LLMs, and is the first to provide both CLI and WebUI tools for real-time result inspection. We validate the effectiveness of dynamic vocabulary methods on modern LLMs and demonstrate support for batch inference, significantly improving inference throughput.
Overview of CCL25-Eval Task6: Chinese Essay Rhetoric Recognition Evaluation (CERRE)
Yujiang Lu | Nuowei Liu | Yupei Ren | Yicheng Zhu | Man Lan | Xiaopeng Bai | Mofan Xu | Qingyu Liao
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Literary grace in Chinese composition writing is a hallmark of linguistic sophistication, often realized through various rhetorical devices. The automatic identification and analysis of rhetorical devices in essays play a crucial role in educational NLP applications, particularly for assessing writing proficiency and facilitating pedagogical interventions. Prior research has predominantly focused on coarse-grained recognition of a limited set of rhetorical devices at the sentence level, and these approaches prove inadequate for handling complex rhetorical structures and emerging educational demands. In this paper, we present the CCL25-Eval Task6: Chinese Essay Rhetoric Recognition Evaluation (CERRE), a novel framework comprising three distinct evaluation tracks at the document level: (1) Fine-grained Form-level Categories Recognition, (2) Fine-grained Content-level Categories Recognition, and (3) Rhetorical Component Extraction. The evaluation attracted 29 registered participating teams, with 8 teams submitting valid system outputs. In particular, two participating systems demonstrated superior performance by exceeding the baseline across the complete set of evaluation criteria.
2024
CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in Essays
Nuowei Liu | Xinhao Chen | Hongyi Wu | Changzhi Sun | Man Lan | Yuanbin Wu | Xiaopeng Bai | Shaoguang Mao | Yan Xia
Findings of the Association for Computational Linguistics: EMNLP 2024
Chinese Essay Rhetoric Recognition and Understanding (CERRU)
Nuowei Liu | Xinhao Chen | Yupei Ren | Man Lan | Xiaopeng Bai | Yuanbin Wu | Shaoguang Mao | Yan Xia
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Rhetoric is fundamental to the reading comprehension and writing skills of primary and middle school students. However, current work recognizes either single coarse-grained categories or fine-grained categories independently. In this paper, we propose the CCL24-Eval Task6: Chinese Essay Rhetoric Recognition and Understanding (CERRU), consisting of 3 tracks: (1) Fine-grained Form-level Categories Recognition, (2) Fine-grained Content-level Categories Recognition, and (3) Rhetorical Component Extraction. A total of 32 teams registered to participate in CERRU and 9 teams submitted evaluation results, with 7 of these teams achieving an overall score that surpassed the baseline.