Wen-Hsiang Lu


2025

A Multi-Module Error Detection and Correction System for Hakka ASR
Min-Chun Hu | Yu-Lin Xiao | Wen-Hsiang Lu
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

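This study proposes a post-ASR correction system for Hakka (mainly the Dapu and Zhao'an dialects), aiming to address the high recognition error rates typical of low-resource languages. Because of limited corpus size, variant characters, and dialectal differences, Hakka is often recognized poorly by existing general-purpose ASR models. We therefore take Whisper Large v3 Turbo as the base recognition model and fine-tune it on about 60 hours of Dapu and Zhao'an speech data to improve its adaptation to these dialects. After obtaining the ASR N-best candidate sentences, the system corrects them through a multi-module error detection and correction pipeline with four main steps: (1) potential error detection, which locates likely erroneous words by comparing the candidates; (2) phoneme confusion set detection, which proposes possible replacement words based on phonemic similarity; (3) lexicon correction, which ensures that words fall within the range of actual language use; and (4) collocation association detection, which uses collocation association scores built from the collected corpus to detect erroneous words. The proposed correction mechanism compensates for the shortcomings of ASR in low-resource languages. Experiments show that after multi-stage error detection and correction, the final CER falls to 15.49%, a reduction of 2.14%, demonstrating that the method improves speech recognition accuracy.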
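
The result above is reported as character error rate (CER). As a reminder of how this metric is commonly computed, here is a minimal sketch: the character-level edit distance between hypothesis and reference, summed over the test set and divided by the total number of reference characters. The function names `edit_distance` and `cer` are illustrative and do not come from the paper, and the paper's exact scoring conventions (text normalization, punctuation handling) are not stated in the abstract.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between a reference and a hypothesis."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

Because Python strings are sequences of Unicode code points, the same function applies directly to Han-character transcripts, where each character counts as one unit.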

2022

Using Grammatical and Semantic Correction Model to Improve Chinese-to-Taiwanese Machine Translation Fluency
Yuan-Han Li | Chung-Ping Young | Wen-Hsiang Lu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Currently, there are three major issues to tackle in Chinese-to-Taiwanese machine translation: multi-pronunciation Taiwanese words, unknown words, and Chinese-to-Taiwanese grammatical and semantic transformation. Recent studies have mostly focused on multi-pronunciation Taiwanese words and unknown words, while very few papers address grammatical and semantic transformation. However, some grammatical rules are exclusive to Taiwanese and, if not translated properly, make the result feel unnatural to native speakers and can distort the original meaning of the sentence, even when the words and pronunciations are right. Therefore, this study collects and organizes a few common Taiwanese sentence structures and grammar rules, then builds a grammatical and semantic correction model for Chinese-to-Taiwanese machine translation that detects and corrects grammatical and semantic discrepancies between the two languages, thus improving translation fluency.

Intelligent Future Recreation Harbor Application Service: Taking Kaohsiung Asia New Bay as an Example to Construct a Composite Recreational Knowledge Graph
Dian-Zhi Wu | Yu-De Lu | Chia-Ming Tung | Bo-Yang Huang | Hsun-Hui Huang | Chien-Der Lin | Wen-Hsiang Lu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Taiwan currently lacks comprehensive, specialized service design for harbour recreation, so its marine recreational activities and marine scenic spots have not yet been planned and developed as integrated services spanning the city and the harbour. There are few state-of-the-art products and application services, and Taiwan’s harbour leisure service industries face the challenge of digital transformation. In response, the Institute for Information Industry proposed an innovative “Smart Future Recreational Harbour Application Service” project, with Kaohsiung’s Asia New Bay Area as the main demonstration site. The project uses multi-source knowledge graph integration and inference technology to recommend appropriate recreational service information, so that tourists can enjoy the best virtual reality intelligent human-machine interactive service experience during their trip.

2016

Identifying the Names of Complex Search Tasks with Task-Related Entities
Ting-Xuan Wang | Wen-Hsiang Lu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 21, Number 1, June 2016

2015

部落客憂鬱傾向分析與預測(Analysis and Prediction of Blogger’s Depression Tendency)[In Chinese]
Chia-Ming Tung | Wen-Hsiang Lu
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing (ROCLING 2015)

2014

Identifying Real-Life Complex Task Names with Task-Intrinsic Entities from Microblogs
Ting-Xuan Wang | Kun-Yu Tsai | Wen-Hsiang Lu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Location and Activity Recommendation by Using Consecutive Itinerary Matching Model
Jiun-Shian Liu | Wen-Hsiang Lu
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing (ROCLING 2013)

2008

Improving Translation of Queries with Infrequent Unknown Abbreviations and Proper Names
Wen-Hsiang Lu | Jiun-Hung Lin | Yao-Sheng Chang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 13, Number 1, March 2008: Special Issue on Cross-Lingual Information Retrieval and Question Answering

2005

Proceedings of the 17th Conference on Computational Linguistics and Speech Processing
Chung-Hsien Wu | Jen-Tzung Chien | Wen-Hsiang Lu
Proceedings of the 17th Conference on Computational Linguistics and Speech Processing

Improving Translation of Unknown Proper Names Using a Hybrid Web-based Translation Extraction Method
Min-Shiang Shia | Jiun-Hung Lin | Scott Yu | Wen-Hsiang Lu
Proceedings of the 17th Conference on Computational Linguistics and Speech Processing

2004

Creating Multilingual Translation Lexicons with Regional Variations Using Web Corpora
Pu-Jen Cheng | Wen-Hsiang Lu | Jei-Wen Teng | Lee-Feng Chien
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

LiveTrans: Translation Suggestion for Cross-Language Web Search from Web Anchor Texts and Search Results
Wen-Hsiang Lu | Lee-Feng Chien | Hsi-Jian Lee
Proceedings of Research on Computational Linguistics Conference XV

2002

A Transitive Model for Extracting Translation Equivalents of Web Queries through Anchor Text Mining
Wen-Hsiang Lu | Lee-Feng Chien | Hsi-Jian Lee
COLING 2002: The 19th International Conference on Computational Linguistics

1999

Recent Results on Domain-Specific Term Extraction from Online Chinese Text Resources
Lee-Feng Chien | Chung-Liang Chen | Wen-Hsiang Lu | Yuan-Lu Chang
Proceedings of Research on Computational Linguistics Conference XII