How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities

Aly M. Kassem; Bernhard Schölkopf; Zhijing Jin

Abstract

Large language model (LLM) routing has emerged as a crucial strategy for balancing computational cost against performance by dynamically assigning each query to the most appropriate model based on query complexity. Despite recent advances showing that preference-data-based routers can outperform traditional methods, current evaluation benchmarks remain limited: they largely focus on general model capabilities while overlooking task-specific behaviors and critical concerns such as privacy, safety, and potential backdoor vulnerabilities introduced through preference data. In response, we propose the DSC benchmark (Diverse, Simple, and Categorized), an evaluation framework that categorizes router performance across a broad spectrum of query types, including coding, translation, mathematics, human instructions, general knowledge, and LLM jailbreaking, and integrates privacy and safety assessments to reveal hidden risks. Our experiments on three preference-based routers and two commercial counterparts demonstrate that while these systems improve efficiency, they often make suboptimal, category-driven decisions; for instance, a BERT-based router directs all coding and mathematics queries to the most powerful LLM, even when simpler models would suffice, while routing jailbreaking attempts to weaker models, thereby elevating safety risks.

Anthology ID: 2026.eacl-long.351
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 7496–7507
URL: https://aclanthology.org/2026.eacl-long.351/
Cite (ACL): Aly M. Kassem, Bernhard Schölkopf, and Zhijing Jin. 2026. How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7496–7507, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities (Kassem et al., EACL 2026)
PDF: https://aclanthology.org/2026.eacl-long.351.pdf
Checklist: 2026.eacl-long.351.checklist.pdf