Daisuke Saito
2026
Incorporating Respect into LLM-Based Academic Feedback: A BI-R Framework for Instructing Students after Q&A Sessions
Mayuko Aiba | Daisuke Saito | Nobuaki Minematsu
Proceedings of the 16th International Workshop on Spoken Dialogue System Technology
In academic research, post-presentation Q&A sessions are crucial for deepening understanding and shaping research directions. Supervisors’ comments are particularly valuable when they highlight perspectives that students have not yet fully considered. Such comments typically arise from careful reasoning within dialogue, yet large language models (LLMs) still struggle to reason precisely about dialogue context and communicative intentions. Building on LLMs, this study proposes a feedback generation framework based on the Belief–Desire–Intention (BDI) model, which conceptualizes Q&A sessions as cognitive interactions between presenters and questioners. We further extend this framework into BI-R by introducing Respect as an explicit dimension, ensuring that generated feedback is not only accurate but also pedagogically constructive. We evaluated the proposed framework (BDI and BI-R) through comparative experiments with master’s students and field experiments with doctoral students during pre-defense presentations. Results showed that while the BDI prompt did not outperform the baseline, the BI-R prompt was particularly effective when students did not fully grasp the broader context or background of the questions. When comparing BDI and BI-R, the inclusion of Respect improved the tone and pedagogical appropriateness of feedback. These findings highlight the potential of the proposed framework as a supportive tool for training students and early-career researchers.
2022
Can We Train a Language Model Inside an End-to-End ASR Model? - Investigating Effective Implicit Language Modeling
Zhuo Gong | Daisuke Saito | Sheng Li | Hisashi Kawai | Nobuaki Minematsu
Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
Language models (LMs) have played a crucial role in automatic speech recognition (ASR) by enhancing the performance of end-to-end (E2E) ASR systems. Approaches fall into two categories: finding better ways to integrate LMs into ASR systems, and adapting LMs to the task domain. This article starts with a reflection on interpolation-based methods for integrating the scores of an E2E ASR model with those of an LM. We then focus on LM augmentation approaches based on the noisy channel model, motivated by insights obtained from that reflection. The experiments show that we can enhance an encoder-decoder E2E ASR model by pre-training its decoder with text data. This implies that the decoder of an E2E model can be treated as an LM, and reveals the possibility of enhancing the E2E model without an external LM. Based on these ideas, we propose an implicit language model canceling method and further discuss the decoder part of an E2E ASR model. Experimental results on the TED-LIUM2 dataset show that our approach achieves a 3.4% relative WER reduction compared with the baseline system, and additional analytic experiments provide concrete support for our assumption.
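The interpolation-based score integration and implicit-LM cancellation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, the interpolation weight `lam`, and the cancellation weight `mu` are illustrative assumptions. The standard log-linear form combines per-token log-probabilities from the E2E model and an external LM, while cancellation subtracts an estimate of the E2E decoder's internal LM score:

```python
# Hypothetical sketch of LM score integration for E2E ASR decoding.
# All names and weights are illustrative, not the paper's actual code.

def fused_scores(asr_log_probs, ext_lm_log_probs, ilm_log_probs=None,
                 lam=0.3, mu=0.2):
    """Log-linear fusion of per-token candidate scores.

    asr_log_probs:    log P_e2e(token | context) for each candidate token
    ext_lm_log_probs: log P_lm(token | context) from an external LM
    ilm_log_probs:    optional estimate of the E2E decoder's implicit LM,
                      subtracted to "cancel" it (noisy-channel-style view)
    """
    scores = []
    for i, a in enumerate(asr_log_probs):
        s = a + lam * ext_lm_log_probs[i]
        if ilm_log_probs is not None:
            s -= mu * ilm_log_probs[i]  # implicit LM cancellation
        scores.append(s)
    return scores


def pick_token(asr_log_probs, ext_lm_log_probs, **kw):
    """Greedy choice of the next token under the fused score."""
    fused = fused_scores(asr_log_probs, ext_lm_log_probs, **kw)
    return max(range(len(fused)), key=fused.__getitem__)
```

In practice such fusion runs inside beam search rather than a greedy loop, and the implicit-LM estimate would come from the decoder run without acoustic input; the sketch only shows the score arithmetic.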