Agent-Testing Agent: A Meta-Agent for Automated Testing and Evaluation of Conversational AI Agents

Sameer Komoravolu; Khalil Mrini

Abstract

LLM agents are increasingly deployed to plan, retrieve, and write with tools, yet evaluation still leans on static benchmarks and small human studies. We present the Agent-Testing Agent (ATA), a meta-agent that combines static code analysis, developer interrogation, literature mining, and persona-driven adversarial test generation whose difficulty adapts via judge feedback. Each dialogue is scored with an LLM-as-a-Judge (LAAJ) rubric and used to steer subsequent tests toward the agent’s weakest capabilities. On a travel planner and a Wikipedia writer, the ATA surfaces more diverse and severe failures than expert annotators while matching severity, and finishes in 20–30 minutes versus ten-annotator rounds that took days. Ablating code analysis and web search increases variance and miscalibration, underscoring the value of evidence-grounded test generation. The ATA outputs quantitative metrics and qualitative bug reports for developers. We release the full open-source implementation.
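The feedback loop described in the abstract (judge scores steering later tests toward the weakest capabilities, with difficulty adapting along the way) can be pictured with a minimal sketch. This is not the released implementation. The function names, the 1-5 difficulty scale, the 0.7 pass threshold, and the mean-score selection rule are all assumptions made for illustration. The `run_dialogue` and `judge` callables are hypothetical stand-ins for the persona-driven test dialogue and the LAAJ rubric.

```python
from statistics import mean
from typing import Any, Callable, Dict, List, Tuple


def pick_weakest(scores: Dict[str, List[float]]) -> str:
    """Return the capability with the lowest mean judge score.

    Capabilities with no scores yet are treated as weakest so that each
    one is tested at least once (an assumption of this sketch).
    """
    return min(
        scores,
        key=lambda cap: mean(scores[cap]) if scores[cap] else float("-inf"),
    )


def adapt_difficulty(level: int, score: float, threshold: float = 0.7,
                     lowest: int = 1, highest: int = 5) -> int:
    """Raise difficulty after a passing score, lower it after a failing one."""
    if score >= threshold:
        return min(highest, level + 1)
    return max(lowest, level - 1)


def run_adaptive_tests(
    capabilities: List[str],
    run_dialogue: Callable[[str, int], Any],  # hypothetical: runs one test dialogue
    judge: Callable[[Any], float],            # hypothetical: rubric score in [0, 1]
    rounds: int = 10,
) -> List[Tuple[str, int, float]]:
    """Run `rounds` tests, each aimed at the currently weakest capability."""
    scores: Dict[str, List[float]] = {cap: [] for cap in capabilities}
    levels: Dict[str, int] = {cap: 1 for cap in capabilities}
    history: List[Tuple[str, int, float]] = []
    for _ in range(rounds):
        cap = pick_weakest(scores)
        transcript = run_dialogue(cap, levels[cap])
        score = judge(transcript)
        scores[cap].append(score)
        history.append((cap, levels[cap], score))
        levels[cap] = adapt_difficulty(levels[cap], score)
    return history
```

In this sketch each history entry records the capability tested, the difficulty level used, and the judge score, which is roughly the raw material a quantitative report would aggregate. A real system would likely use a richer selection rule than a running mean.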

Anthology ID: 2026.eacl-long.339
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 7199–7214
URL: https://aclanthology.org/2026.eacl-long.339/
Cite (ACL): Sameer Komoravolu and Khalil Mrini. 2026. Agent-Testing Agent: A Meta-Agent for Automated Testing and Evaluation of Conversational AI Agents. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7199–7214, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Agent-Testing Agent: A Meta-Agent for Automated Testing and Evaluation of Conversational AI Agents (Komoravolu & Mrini, EACL 2026)
PDF: https://aclanthology.org/2026.eacl-long.339.pdf
Checklist: 2026.eacl-long.339.checklist.pdf