Fan-Yun Sun
Also published as: Fan-yun Sun
2024
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition
Hsuan Su
|
Hua Farn
|
Fan-Yun Sun
|
Shang-Tse Chen
|
Hung-yi Lee
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains. However, existing methods suffer in performance when they fine-tune an automatic speech recognition (ASR) model on synthetic data as they suffer from the distributional shift commonly referred to as the synthetic-to-real gap. In this paper, we find that task arithmetic is effective at mitigating this gap. Our proposed method, SYN2REAL task vector, shows an average improvement of 10.03% improvement in word error rate over baselines on the SLURP dataset. Additionally, we show that an average of SYN2REAL task vectors, when we have real speeches from multiple different domains, can further adapt the original ASR model to perform better on the target text domain.
2021
Put Chatbot into Its Interlocutor’s Shoes: New Framework to Learn Chatbot Responding with Intention
Hsuan Su
|
Jiun-Hao Jhan
|
Fan-yun Sun
|
Saurav Sahay
|
Hung-yi Lee
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Most chatbot literature that focuses on improving the fluency and coherence of a chatbot, is dedicated to making chatbots more human-like. However, very little work delves into what really separates humans from chatbots – humans intrinsically understand the effect their responses have on the interlocutor and often respond with an intention such as proposing an optimistic view to make the interlocutor feel better. This paper proposes an innovative framework to train chatbots to possess human-like intentions. Our framework includes a guiding chatbot and an interlocutor model that plays the role of humans. The guiding chatbot is assigned an intention and learns to induce the interlocutor to reply with responses matching the intention, for example, long responses, joyful responses, responses with specific words, etc. We examined our framework using three experimental setups and evaluated the guiding chatbot with four different metrics to demonstrate flexibility and performance advantages. Additionally, we performed trials with human interlocutors to substantiate the guiding chatbot’s effectiveness in influencing the responses of humans to a certain extent. Code will be made available to the public.
Search
Co-authors
- Hsuan Su 2
- Hung-Yi Lee 2
- Hua Farn 1
- Shang-Tse Chen 1
- Jiun-Hao Jhan 1
- show all...