%0 Conference Proceedings
%T ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models
%A Zhang, Yuxiang
%A Chen, Jing
%A Wang, Junjie
%A Liu, Yaxin
%A Yang, Cheng
%A Shi, Chufan
%A Zhu, Xinyu
%A Lin, Zihao
%A Wan, Hanwen
%A Yang, Yujiu
%A Sakai, Tetsuya
%A Feng, Tian
%A Yamana, Hayato
%Y Al-Onaizan, Yaser
%Y Bansal, Mohit
%Y Chen, Yun-Nung
%S Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
%D 2024
%8 November
%I Association for Computational Linguistics
%C Miami, Florida, USA
%F zhang-etal-2024-toolbehonest
%R 10.18653/v1/2024.emnlp-main.637
%U https://aclanthology.org/2024.emnlp-main.637/
%U https://doi.org/10.18653/v1/2024.emnlp-main.637
%P 11388-11422