Hsueh-Chih Chen


2022

The Design and Development of a System for Chinese Character Difficulty and Features
Jung-En Haung | Hou-Chiang Tseng | Li-Yun Chang | Hsueh-Chih Chen | Yao-Ting Sung
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Feature analysis of Chinese characters plays a prominent role in “character-based” education. However, there is an urgent need for a text analysis system that assesses the difficulty of the components that make up characters, primarily based on Chinese learners’ performance. To meet this need, this research set out to provide such a system by adopting a data-driven approach. Based on Chen et al.’s (2011) Chinese Orthography Database, this research has designed and developed a system: Character Difficulty - Research on Multi-features (CD-ROM). This system provides three functions: (1) analyzing a text and reporting its difficulty with respect to Chinese characters; (2) decomposing characters into components and calculating the frequency of components based on the analyzed text; and (3) providing the characters derived from each component, based on the analyzed text, along with downloadable images for use as teaching materials. With these functions highlighting multi-level features of characters, this system has the potential to benefit the fields of Chinese character instruction, Chinese orthographic learning, and Chinese natural language processing.

2020

Development and Validation of a Corpus for Machine Humor Comprehension
Yuen-Hsien Tseng | Wun-Syuan Wu | Chia-Yueh Chang | Hsueh-Chih Chen | Wei-Lun Hsu
Proceedings of the Twelfth Language Resources and Evaluation Conference

This work developed a Chinese humor corpus containing 3,365 jokes collected from over 40 sources. Each joke was labeled by a single annotator with five levels of funniness, eight skill sets of humor, and six dimensions of intent. To validate the manual labels, we trained SVM (Support Vector Machine) and BERT (Bidirectional Encoder Representations from Transformers) models on half of the corpus (labeled by one annotator) to predict the skill and intent labels of the other half (labeled by the other annotator). Based on two assumptions that a valid manually labeled corpus should satisfy, our results supported the validity of the skill and intent labels. As for the funniness label, the validation results showed that the correlation between the corpus label and user feedback ratings is marginal, which implies that the funniness level is a harder annotation problem. The contribution of this work is twofold: 1) a Chinese humor corpus is developed with labels for humor skills, intents, and funniness, which allows machines to learn more intricate humor framing, effect, and amusement level in order to predict and respond in the proper context (https://github.com/SamTseng/Chinese_Humor_MultiLabeled); 2) an approach is proposed to verify whether a minimally human-labeled corpus is valid, which facilitates the validation of low-resource corpora.

2015

Introduction to a Proofreading Tool for Chinese Spelling Check Task of SIGHAN-8
Tao-Hsing Chang | Hsueh-Chih Chen | Cheng-Han Yang
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

2014

使用中文字筆畫構形資料庫校正字形相似之別字 (Using Chinese Orthography Database to Correct Chinese Misspelling Words With Graphemic Similarity) [In Chinese]
Tao-Hsing Chang | Hsueh-Chih Chen | Jian-Liang Zheng
Proceedings of the 26th Conference on Computational Linguistics and Speech Processing (ROCLING 2014)

2013

基於特徵為本及使用SVM 的文本對蘊涵關係的自動推論方法 (Textual Entailment Recognition Using Textual Features and SVM) [In Chinese]
Tao-Hsing Chang | Yao-Chi Hsu | Chung-Wei Chang | Yao-Chuan Hsu | Hsueh-Chih Chen
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing (ROCLING 2013)

Automatic Detection and Correction for Chinese Misspelled Words Using Phonological and Orthographic Similarities
Tao-Hsing Chang | Hsueh-Chih Chen | Yuen-Hsien Tseng | Jian-Liang Zheng
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing

2012

字形相似別字之自動校正方法 (Automatic Correction for Graphemic Chinese Misspelled Words) [In Chinese]
Tao-Hsing Chang | Shou-Yen Su | Hsueh-Chih Chen
Proceedings of the 24th Conference on Computational Linguistics and Speech Processing (ROCLING 2012)