Taming System Complexity: Demystifying Software Engineering Agents in Diagnosing Linux Kernel Faults

Zhenhao Zhou; Zhuochen Huang; Yike He; Chong Wang; Jiajun Wang; Yijian Wu; Xin Peng; Yiling Lou

Abstract

The Linux kernel is a critical system, serving as the foundation for numerous systems. Bugs in the Linux kernel can cause serious consequences, affecting billions of users. Fault localization (FL), which aims at identifying the buggy code elements in software, plays an essential role in software quality assurance. While recent LLM agents have achieved promising FL accuracy on recent benchmarks like SWE-bench, it remains unclear how well these methods perform on the Linux kernel, where FL is much more challenging due to the large-scale code base, limited observability, and diverse impact factors. In this paper, we introduce LinuxFLBench, an FL benchmark constructed from real-world Linux kernel bugs. We conduct an empirical study to assess the performance of state-of-the-art LLM agents on the Linux kernel. Our initial results reveal that existing agents struggle with this task, achieving a best top-1 accuracy of only 41.6% at the file level. To address this challenge, we propose LinuxFL+, an enhancement framework designed to improve the FL effectiveness of LLM agents for the Linux kernel. LinuxFL+ substantially improves the FL accuracy of all studied agents (e.g., a 7.2%–11.2% accuracy increase) at minimal cost.

Anthology ID: 2026.acl-long.862
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 18899–18916
URL: https://aclanthology.org/2026.acl-long.862/
Cite (ACL): Zhenhao Zhou, Zhuochen Huang, Yike He, Chong Wang, Jiajun Wang, Yijian Wu, Xin Peng, and Yiling Lou. 2026. Taming System Complexity: Demystifying Software Engineering Agents in Diagnosing Linux Kernel Faults. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 18899–18916, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Taming System Complexity: Demystifying Software Engineering Agents in Diagnosing Linux Kernel Faults (Zhou et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.862.pdf
Checklist: 2026.acl-long.862.checklist.pdf
