Wu Yunfang
2024
Going Beyond Passages: Readability Assessment for Book-level Long Texts
Li Wenbiao | Sun Rui | Zhang Tianyi | Wu Yunfang
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Readability assessment for book-level long texts is widely needed in real educational applications. However, most current research focuses on passage-level readability assessment, and little work has been done to process ultra-long texts. In order to better process the long sequences of book texts and to enhance pretrained models with difficulty knowledge, we propose a novel model, DSDR, with difficulty-aware segment pre-training and difficulty multi-view representation. Specifically, we split all books into multiple fixed-length segments and employ unsupervised clustering to obtain difficulty-aware segments, which are used to re-train the pretrained model to learn difficulty knowledge. Accordingly, a long text is represented by averaging multiple vectors of segments with varying difficulty levels. We construct a new dataset of Graded Children's Books to evaluate model performance. Our proposed model achieves promising results, outperforming both the traditional SVM classifier and several popular pretrained models. In addition, our work establishes a new prototype for book-level readability assessment, which provides an important benchmark for related research in future work.
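The segment-then-average representation described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the segment length, the placeholder `embed` function (which stands in for a pretrained-model encoder), and all function names are assumptions, and the difficulty-aware clustering and re-training steps are not reproduced.

```python
# Toy sketch of a segment-and-average long-text representation.
# All names and the embedding are placeholders, not the DSDR model.

def split_into_segments(text, seg_len=8):
    """Split a long text into fixed-length token segments."""
    tokens = text.split()
    return [tokens[i:i + seg_len] for i in range(0, len(tokens), seg_len)]

def embed(segment):
    """Placeholder embedding: mean token length of the segment
    (stands in for a vector from a pretrained encoder)."""
    return [sum(len(t) for t in segment) / len(segment)]

def book_representation(text, seg_len=8):
    """Represent the whole text by averaging its segment vectors."""
    vecs = [embed(s) for s in split_into_segments(text, seg_len)]
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
```

In the paper, each segment would instead be encoded by the difficulty-aware re-trained model, and the averaged vector fed to a readability classifier.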
2021
Enhancing Question Generation with Commonsense Knowledge
Jia Xin | Wang Hao | Yin Dawei | Wu Yunfang
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Question generation (QG) is to generate natural and grammatical questions that can be answered by a specific answer for a given context. Previous sequence-to-sequence models suffer from the problem that asking high-quality questions requires commonsense knowledge as background, which in most cases cannot be learned directly from training data, resulting in unsatisfactory questions deprived of knowledge. In this paper, we propose a multi-task learning framework to introduce commonsense knowledge into the question generation process. We first retrieve relevant commonsense knowledge triples from mature databases and select triples with the conversion information from source context to question. Based on these informative knowledge triples, we design two auxiliary tasks to incorporate commonsense knowledge into the main QG model, where one task is Concept Relation Classification and the other is Tail Concept Generation. Experimental results on SQuAD show that our proposed methods are able to noticeably improve the QG performance on both automatic and human evaluation metrics, demonstrating that incorporating external commonsense knowledge with multi-task learning can help the model generate human-like and high-quality questions.
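The triple-selection step described in the abstract can be illustrated with a minimal sketch. The matching rule here (keep triples whose head concept appears in the source context) is a deliberate simplification and an assumption on my part; the paper's actual selection uses conversion information from context to question, and the function and variable names are hypothetical.

```python
# Toy sketch of selecting commonsense triples relevant to a context.
# The matching rule is a simplification, not the paper's method.

def select_triples(context, triples):
    """Keep (head, relation, tail) triples whose head concept
    occurs among the context's tokens."""
    ctx_tokens = set(context.lower().split())
    return [(h, r, t) for (h, r, t) in triples if h.lower() in ctx_tokens]

context = "The river froze during the cold winter"
triples = [
    ("winter", "HasProperty", "cold"),
    ("summer", "HasProperty", "hot"),
]
selected = select_triples(context, triples)
```

In the full framework, the selected triples would then feed the two auxiliary tasks (Concept Relation Classification and Tail Concept Generation) trained jointly with the QG model.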