Kye Min Tan
2023
I2R’s End-to-End Speech Translation System for IWSLT 2023 Offline Shared Task
Muhammad Huzaifah | Kye Min Tan | Richeng Duan
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
This paper describes I2R’s submission to the offline speech translation track for IWSLT 2023. We focus on an end-to-end approach for translation from English audio to German text, one of the three available language directions in this year’s edition. The I2R system builds its base model on pretrained models that have been exposed to large-scale audio and text data. We introduce several stages of additional pretraining followed by fine-tuning to adapt the system for the downstream speech translation task. The strategy is supplemented by other techniques such as data augmentation, domain tagging, knowledge distillation, and model ensembling. We evaluate the system on several publicly available test sets for comparison.
Ensemble Method via Ranking Model for Conversational Modeling with Subjective Knowledge
Xin Huang | Kye Min Tan | Richeng Duan | Bowei Zou
Proceedings of The Eleventh Dialog System Technology Challenge
This paper describes our submission to the fifth track of the 11th Dialog System Technology Challenge (DSTC-11), which focuses on “Task-oriented Conversational Modeling with Subjective Knowledge”. We focus on response generation and use a ranking strategy to ensemble individual models: BART, Long-T5, and a fine-tuned large language model based on LLaMA. The strategy is supplemented by other techniques such as low-rank adaptation, which keeps the use of these large models efficient while still achieving optimal performance. The experiments show that the ensemble method outperforms both the individual models and the baseline method. Our model ranked 1st in ROUGE_1, 2nd in ROUGE_L, and 4th in human evaluation among a total of 14 participating teams.