OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking

Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf


Abstract
Large language models (LLMs) have revolutionized the landscape of Natural Language Processing, but are computationally expensive. To reduce the cost without sacrificing performance, previous studies have explored various approaches to harness the potential of Smaller Language Models (SLMs) as cost-effective alternatives to their larger counterparts. Driven by findings that SLMs and LLMs exhibit complementary strengths in a structured knowledge extraction task, this work presents a novel SLM/LLM routing framework designed to improve computational efficiency and enhance task performance. In dialogue state tracking tasks, the proposed routing framework enhances performance substantially compared to relying solely on LLMs, while reducing the computational costs by over 50%.
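
To make the routing idea concrete, below is a minimal sketch of one plausible SLM/LLM dispatch scheme: label held-out dialogue turns by which model handled them correctly, embed them, and route a new turn to the model favored by its nearest exemplars. The exemplar pools, the encoder choice, and the k-NN vote are illustrative assumptions, not the paper's exact procedure; see the PDF for the authors' actual method.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical exemplar pools: dialogue turns the small model (SLM)
# previously got right, and turns that required the large model (LLM).
SLM_POOL = ["i need a cheap hotel in the north part of town"]
LLM_POOL = ["actually change that to thursday, and also book a taxi between the two places"]

# Assumed encoder; any sentence embedding model could stand in here.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def route(turn: str, k: int = 5) -> str:
    """Return 'slm' or 'llm' via a k-nearest-neighbor vote over exemplar pools."""
    pool = SLM_POOL + LLM_POOL
    labels = ["slm"] * len(SLM_POOL) + ["llm"] * len(LLM_POOL)
    # Normalized embeddings make the dot product a cosine similarity.
    emb = encoder.encode([turn] + pool, normalize_embeddings=True)
    sims = emb[1:] @ emb[0]                    # similarity of each exemplar to the turn
    top = np.argsort(-sims)[: min(k, len(pool))]  # indices of the k nearest exemplars
    votes = [labels[i] for i in top]
    return max(set(votes), key=votes.count)

print(route("book me a cheap hotel"))  # expected: 'slm'
```

Routing turns that resemble SLM-solvable exemplars to the cheaper model is what yields the cost savings the abstract reports, while reserving the LLM for harder cases preserves (or improves) accuracy.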
Anthology ID:
2024.naacl-long.79
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1434–1445
URL:
https://aclanthology.org/2024.naacl-long.79
DOI:
10.18653/v1/2024.naacl-long.79
Cite (ACL):
Chia-Hsuan Lee, Hao Cheng, and Mari Ostendorf. 2024. OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1434–1445, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking (Lee et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.79.pdf