Chia-Hui Chang


2025

This study proposes a system architecture named the Collision Care Guide (CCG), a traffic-accident information collection agent focused on structured information gathering in the initial stage of an accident. CCG integrates three modules: question generation, information extraction, and accident reconstruction. Through multi-turn dialogue, it guides users to describe accident details, converts the narrative into a structured data format (TARF), and generates a readable summary for verification. To meet requirements for cost efficiency, privacy protection, and deployment flexibility, this study compares open-source Llama models (3B/8B parameters, with full fine-tuning and 4-bit PEFT) against the commercial baseline GPT-4o-mini. Results show that the information extraction module achieves field accuracy above 0.94 and JSON semantic similarity of 0.995; the question generation module reaches semantic similarity between 0.85 and 0.88 with more concise question phrasing. The fine-tuned models score above 4 (out of 5) in LLM-based evaluations of both dialogue quality and information extraction, within 0.5 points of the commercial baseline. The study confirms that fine-tuned open-source models can approach commercial-model performance, and that quantized versions offer strong efficiency and deployment potential in resource-constrained settings. The CCG design fills a technical gap in interactive information collection during the initial stage of an accident and provides an efficient, cost-effective solution for traffic accident handling.
Despite recent advances in AI, ASR systems still struggle with real-world errors arising from pronunciation variation and homophones. To address this issue, we propose a verbal-command-based correction system that enables users to utter natural-language instructions to refine recognition outputs with minimal effort. The system consists of three modules: an input classifier, a command classifier, and a correction labeler. To support training and evaluation, we simulate potential ASR errors via a TTS-then-ASR pipeline, followed by verbal correction commands generated from linguistic features or LLMs. Experiments show that the overall system achieves over 80% correction accuracy and delivers stable performance. Compared to manual correction, the system also demonstrates highly competitive correction speed, indicating its feasibility for practical deployment.

2023

Retelling a story is one way to develop narrative skills in students, but it may present challenges for English as a Second Language (ESL) students who are learning new stories and vocabulary at the same time. The goal of this research is to develop a dialogue module for story co-telling that helps ESL students co-narrate an English story and enhance their narrative skills. However, story co-telling is a relatively underexplored and novel task. To understand the story content and select the right plot to continue the co-telling based on the current dialogue, we utilize open-domain information extraction techniques to construct a knowledge graph, and adopt multi-agent reinforcement learning to train two agents to select relevant facts from the knowledge graph and generate responses, jointly accomplishing the story co-telling task. Compared to models that rely on chronological order, our model improves performance from 67.0% to 70.8% through self-training with reward evaluation, an increase of approximately 3.8 percentage points.

2022

For educators, generating high-quality question-answer pairs from story text is a time-consuming and labor-intensive task. The goal is not to stump students, but to ensure through the generated question-answer pairs that they understand the story text. In this paper, we improve the FairyTaleQA question generation method by incorporating the question type and its definition into the input for fine-tuning the BART (Lewis et al., 2020) model. Furthermore, we make use of the entity and relation extraction from (Zhong and Chen, 2021) as an element of template-based question generation.
Due to a lack of conversation practice, the main challenge for second-language learners is speaking. Our goal is to develop a chatbot that encourages individuals to reflect on, describe, analyse, and communicate what they read, as well as improve students' English expression skills. In this paper, we exploit COMET, an inferential commonsense knowledge generator, as background knowledge to improve generation diversity. We consider two approaches to increase the diversity of empathetic response generation. For non-pretrained models, we apply AdaLabel (Wang et al., 2021) to the Commonsense-aware Empathetic model (Sabour et al., 2022) and improve the Distinct-2 score from 2.99 to 4.08 on EMPATHETIC DIALOGUES (ED). Furthermore, we augment the pretrained BART model with various commonsense knowledge to generate more informative empathetic responses. Not only does the Distinct-2 score improve from 9.11 to 11.21 in automatic evaluation, but a manual case study also shows that CE-BART significantly outperforms CEM-AdaLabel.
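The Distinct-2 scores reported above are commonly computed as the number of unique bigrams divided by the total number of bigrams across all generated responses. A minimal sketch (whitespace tokenization here is illustrative; the papers may use their own tokenizers):

```python
def distinct_n(texts, n=2):
    """Distinct-n: ratio of unique n-grams to total n-grams across responses."""
    total = 0
    unique = set()
    for text in texts:
        tokens = text.split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0
```

A higher score indicates less repetitive generation across the response set.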

2021

For manufacturers of home appliances, studying product discussions on social media can help improve their products. Opinions provided through online reviews immediately reflect whether a product is accepted by consumers and which aspects of the product are most discussed. In this article, we divide the analysis of home appliances into three tasks: named entity recognition (NER), aspect category extraction (ACE), and aspect category sentiment classification (ACSC). To improve the performance of ACSC, we combine the Reptile algorithm from meta-learning with the concept of domain adversarial training to form the Adversarial Reptile algorithm. We show that macro-F1 improves from 68.6% (fine-tuned BERT) to 70.3% (p-value 0.04).
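The Reptile meta-update underlying the approach above repeatedly adapts the shared parameters on one task, then moves them toward the adapted weights: θ ← θ + ε(φ − θ). A toy sketch of that update rule on linear-regression tasks (the task setup, hyperparameters, and function names are illustrative, not the paper's Adversarial Reptile):

```python
import numpy as np

def sgd_task(theta, xs, ys, lr=0.02, steps=10):
    """A few inner-loop SGD steps on one task's linear-regression loss."""
    w = theta.copy()
    for _ in range(steps):
        grad = 2 * xs.T @ (xs @ w - ys) / len(xs)
        w -= lr * grad
    return w

def reptile(theta, tasks, outer_lr=0.5, rounds=20):
    """Reptile meta-update: move theta toward each task's adapted weights."""
    for _ in range(rounds):
        for xs, ys in tasks:
            adapted = sgd_task(theta, xs, ys)
            theta = theta + outer_lr * (adapted - theta)  # theta <- theta + eps*(phi - theta)
    return theta
```

The adversarial variant described in the abstract would additionally pass features through a domain classifier with a gradient reversal layer, which this sketch omits.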
When we are interested in a certain domain, we can collect and analyze data from the Internet. Since newly collected data is unlabeled, we hope that existing labeled data can help with the new data. We perform named entity recognition (NER) and aspect-based sentiment analysis (ABSA) in a multi-task learning setting, combining a parameter generation network with the DANN architecture to build the model. In the NER task, data is labeled with a Tie/Break scheme, and task weights are adjusted according to each task's loss change rate using Dynamic Weight Average (DWA). This study uses two different source-domain datasets. Experimental results show that the Tie/Break scheme improves model results, DWA yields better performance, and combining the parameter generation network with a gradient reversal layer enables effective learning across different domains.
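The DWA weighting mentioned above, following the standard formulation, weights each task by the softmax of its recent loss ratio r_k = L_k(t−1)/L_k(t−2) at temperature T, scaled so the weights sum to the number of tasks. A minimal sketch (the task names and temperature value are illustrative):

```python
import math

def dwa_weights(loss_history, T=2.0):
    """Dynamic Weight Average: weight each task by its recent loss descent rate.

    loss_history: dict mapping task name -> list of per-epoch losses (>= 2 entries).
    Returns a dict of weights summing to the number of tasks.
    """
    ratios = {k: v[-1] / v[-2] for k, v in loss_history.items()}
    exps = {k: math.exp(r / T) for k, r in ratios.items()}
    total = sum(exps.values())
    n_tasks = len(loss_history)
    return {k: n_tasks * e / total for k, e in exps.items()}
```

A task whose loss is descending more slowly (higher ratio) receives a larger weight, nudging training effort toward it.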

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2006