Shaona Ghosh
2024
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Makesh Narsimhan Sreedhar
|
Traian Rebedea
|
Shaona Ghosh
|
Jiaqi Zeng
|
Christopher Parisien
Findings of the Association for Computational Linguistics: EMNLP 2024
Recent advancements in instruction-tuning datasets have predominantly focused on specific tasks like mathematical or logical reasoning. There has been a notable gap in data designed for aligning language models to maintain topic relevance in conversations - a critical aspect for deploying chatbots to production. We introduce the CantTalkAboutThis dataset to help language models remain focused on the subject at hand during task-oriented interactions. It consists of synthetic dialogues on a wide range of conversation topics from different domains. These dialogues are interspersed with distractor turns that intentionally divert the chatbot from the predefined topic. Fine-tuning language models on this dataset helps make them resilient to deviating from the assigned role and improves their ability to maintain topical coherence compared to general-purpose instruction-tuned LLMs like gpt-4-turbo and Mixtral-Instruct. Additionally, preliminary observations suggest that training models on this dataset also enhance their performance on fine-grained instruction following tasks, including safety alignment.
2017
Neuramanteau: A Neural Network Ensemble Model for Lexical Blends
Kollol Das
|
Shaona Ghosh
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
The problem of blend formation in generative linguistics is interesting in the context of neologism, their quick adoption in modern life and the creative generative process guiding their formation. Blend quality depends on multitude of factors with high degrees of uncertainty. In this work, we investigate if the modern neural network models can sufficiently capture and recognize the creative blend composition process. We propose recurrent neural network sequence-to-sequence models, that are evaluated on multiple blend datasets available in the literature. We propose an ensemble neural and hybrid model that outperforms most of the baselines and heuristic models upon evaluation on test data.