Hou-Chiang Tseng


2025

Training a Chinese Listenability Model Using Word2Vec to Predict the Difficulty of Spoken Texts
Yen-Hsiang Chien | Hou-Chiang Tseng | Kuan-Yu Chen | Yao-Ting Sung
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

With the proliferation of digital learning, an increasing number of learners are engaging with audio-visual materials. For preschool and lower elementary students, whose literacy skills are still limited, knowledge acquisition relies more heavily on spoken and visual content. Traditional readability models were primarily developed for written texts, and their applicability to spoken materials remains uncertain. To address this issue, this study investigates the impact of different word segmentation tools and language models on the performance of automatic grade classification models for Chinese spoken materials. Support Vector Machines were employed for grade prediction, aiming to automatically determine the appropriate grade level of learning resources and assist learners in selecting suitable materials. The results show that language models with higher-dimensional word embeddings achieved better classification performance, with an accuracy of up to 61% and an adjacent accuracy of 76%. These findings may contribute to future digital learning platforms or educational resource recommendation systems by automatically providing students with appropriate listening materials to enhance learning outcomes.
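
The abstract reports both accuracy and adjacent accuracy without defining the latter. As a rough illustration only, the sketch below assumes the definition common in readability research: a prediction counts as correct when it falls within one grade level of the true grade. The function name, the `tolerance` parameter, and the sample grades are hypothetical and are not taken from the paper.

```python
def adjacent_accuracy(y_true, y_pred, tolerance=1):
    """Fraction of predictions within `tolerance` grade levels of the truth.

    With tolerance=0 this reduces to ordinary (exact) accuracy.
    Assumes grades are encoded as integers (e.g. 1 to 6).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Hypothetical grade labels, made up for illustration.
true_grades = [1, 2, 3, 4, 5, 6]
pred_grades = [1, 3, 3, 6, 4, 6]

exact = adjacent_accuracy(true_grades, pred_grades, tolerance=0)
adjacent = adjacent_accuracy(true_grades, pred_grades, tolerance=1)
print(f"accuracy={exact:.2f}, adjacent accuracy={adjacent:.2f}")
```

Under this assumed definition, adjacent accuracy can never be lower than exact accuracy, which is consistent with the 61% and 76% figures reported above.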

Exploring the Feasibility of Large Language Model- and Rubric-Based Automatic Assessment of Elementary Students’ Book Summaries
Qi-Zhen Huang | Hou-Chiang Tseng | Yao-Ting Sung
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

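Summary writing is a high-level language task that integrates reading and writing; it not only assesses students' text comprehension but also fosters language expression and paraphrasing skills. Previous automatic summary scoring systems have mostly relied on "bottom-up" methods such as keyword matching or semantic overlap, which struggle to comprehensively evaluate students' depth of understanding and their ability to restate a text. Research on scoring Chinese summary writing exists but remains relatively scarce compared with English, leaving a research gap. The development of large language models (LLMs), with their breakthroughs in semantic understanding and generation, offers new opportunities for automatic summary scoring and feedback. Accordingly, this study takes a top-down approach to explore the potential of combining LLMs with reading summary scoring rubrics for scoring and giving feedback on students' reading summaries. Specifically, in consideration of the privacy of instructional data, this study uses Meta-Llama-3.1-70B to generate machine summaries and, following an expert-designed summary scoring rubric covering five dimensions (comprehension and accuracy, organizational structure, conciseness, language expression and grammar, and paraphrasing ability), automatically scores and provides feedback on students' reading summaries. The results show that Meta-Llama-3.1-70B can provide specific, clear, and immediate feedback: it not only points out key concepts missing from a summary but also suggests corrections for structural arrangement and grammatical errors, helping students quickly grasp how to improve their summaries. However, the feedback tends to focus on surface-level language and structural adjustments, and limitations remain in assessing higher-level language abilities such as language expression, rhetorical variety, and paraphrasing. Overall, LLMs can serve as tools for formative assessment and teaching support that improve scoring efficiency, but they need to be combined with teachers' professional judgment and feedback to supply guidance on deeper concepts and strategic writing, thereby promoting the development of students' summary-writing skills.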

2023

Impact of Feature Selection Algorithms on Readability Model
Tsai-Ning Tai | Hou-Chiang Tseng | Yao-Ting Sung
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

2022

The Design and Development of a System for Chinese Character Difficulty and Features
Jung-En Haung | Hou-Chiang Tseng | Li-Yun Chang | Hsueh-Chih Chen | Yao-Ting Sung
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Feature analysis of Chinese characters plays a prominent role in “character-based” education. However, there is an urgent need for a text analysis system for processing the difficulty of composing components for characters, primarily based on Chinese learners’ performance. To meet this need, the purpose of this research was to provide such a system by adopting a data-driven approach. Based on Chen et al.’s (2011) Chinese Orthography Database, this research has designed and developed a system: Character Difficulty - Research on Multi-features (CD-ROM). This system provides three functions: (1) analyzing a text and providing its difficulty regarding Chinese characters; (2) decomposing characters into components and calculating the frequency of components based on the analyzed text; and (3) affording component-deriving characters based on the analyzed text and downloadable images as teaching materials. With these functions highlighting multi-level features of characters, this system has the potential to benefit the fields of Chinese character instruction, Chinese orthographic learning, and Chinese natural language processing.

2019

基於階層式編碼架構之文本可讀性預測 (A Hierarchical Encoding Framework for Text Readability Prediction)
Shi-Yan Weng | Hou-Chiang Tseng | Yao-Ting Sung | Berlin Chen
Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019)

2018

探索結合快速文本及卷積神經網路於可讀性模型之建立 (Exploring Combination of FastText and Convolutional Neural Networks for Building Readability Models) [In Chinese]
Hou-Chiang Tseng | Berlin Chen | Yao-Ting Sung
Proceedings of the 30th Conference on Computational Linguistics and Speech Processing (ROCLING 2018)

2017

探究不同領域文件之可讀性分析 (Exploring Readability Analysis on Multi-Domain Texts) [In Chinese]
Hou-Chiang Tseng | Yao-Ting Sung | Berlin Chen
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing (ROCLING 2017)

探究使用基於類神經網路之特徵於文本可讀性分類 (Exploring the Use of Neural Network based Features for Text Readability Classification) [In Chinese]
Hou-Chiang Tseng | Berlin Chen | Yao-Ting Sung
International Journal of Computational Linguistics & Chinese Language Processing, Volume 22, Number 2, December 2017-Special Issue on Selected Papers from ROCLING XXIX

2016

基於深層類神經網路及表示學習技術之文件可讀性分類 (Classification of Text Readability Based on Deep Neural Network and Representation Learning Techniques) [In Chinese]
Hou-Chiang Tseng | Hsiao-Tsung Hung | Yao-Ting Sung | Berlin Chen
Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016)

2015

可讀性預測於中小學國語文教科書及優良課外讀物之研究 (A Study of Readability Prediction on Elementary and Secondary Chinese Textbooks and Excellent Extracurricular Reading Materials) [In Chinese]
Yi-Nian Liu | Kuan-Yu Chen | Hou-Chiang Tseng | Berlin Chen
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing (ROCLING 2015)