Dilek Tur
2024
Unsupervised Human Preference Learning
Sumuk Shashidhar | Abhinav Chinta | Vaibhav Sahai | Dilek Tur
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Large language models demonstrate impressive reasoning abilities but struggle to provide personalized content due to their lack of individual user preference information. Existing methods, such as in-context learning and parameter-efficient fine-tuning, fall short in capturing the complexity of human preferences, especially given the small, personal datasets individuals possess. In this paper, we propose a novel approach utilizing small parameter models as preference agents to generate natural language rules that guide a larger, pre-trained model, enabling efficient personalization. Our method involves a small, local “steering wheel” model that directs the outputs of a much larger foundation model, producing content tailored to an individual’s preferences while leveraging the extensive knowledge and capabilities of the large model. Importantly, this personalization is achieved without the need to fine-tune the large model. Experimental results on email and article datasets demonstrate that our technique significantly outperforms baseline personalization methods. By allowing foundation models to adapt to individual preferences in a data- and compute-efficient manner, our approach paves the way for highly personalized language model applications.
Instruct, Not Assist: LLM-based Multi-Turn Planning and Hierarchical Questioning for Socratic Code Debugging
Priyanka Kargupta | Ishika Agarwal | Dilek Tur | Jiawei Han
Findings of the Association for Computational Linguistics: EMNLP 2024
Socratic questioning is an effective teaching strategy, encouraging critical thinking and problem-solving. The conversational capabilities of large language models (LLMs) show great potential for providing scalable, real-time student guidance. However, current LLMs often give away solutions directly, making them ineffective instructors. We tackle this issue in the code debugging domain with TreeInstruct, an Instructor agent guided by a novel state space-based planning algorithm. TreeInstruct asks probing questions to help students independently identify and resolve errors. It estimates a student’s conceptual and syntactical knowledge to dynamically construct a question tree based on their responses and current knowledge state, effectively addressing both independent and dependent mistakes concurrently in a multi-turn interaction setting. In addition to using an existing single-bug debugging benchmark, we construct a more challenging multi-bug dataset of 150 coding problems, incorrect solutions, and bug fixes, all carefully constructed and annotated by experts. Extensive evaluation shows TreeInstruct’s state-of-the-art performance on both datasets, proving it to be a more effective instructor than baselines. Furthermore, a real-world case study with five students of varying skill levels demonstrates TreeInstruct’s ability to guide students to debug their code efficiently with minimal turns and highly Socratic questioning.