Bin Yu
Other people with similar names: Bin Yu
Unverified author pages with similar names: Bin Yu
2026
Formally Specifying the Intended Behavior of the Program: LLM-Driven Neuro-Symbolic Program Specification Synthesis
Cheng Wen | Hu Junjie | YiKun Hu | Jie Su | Bin Yu | Dugang Liu | Zhiwu Xu | Weidi Sun | Shengchao Qin | Cong Tian
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Cheng Wen | Hu Junjie | YiKun Hu | Jie Su | Bin Yu | Dugang Liu | Zhiwu Xu | Weidi Sun | Shengchao Qin | Cong Tian
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Formal verification can provide strong mathematical guarantees about software correctness, but it typically requires developers to write detailed formal specifications (e.g., contracts and loop invariants), which is costly and error-prone. We introduce AutoSpec+, an LLM-driven neuro-symbolic demonstration system that reframes specification writing as constrained structured synthesis: large language models generate candidate specifications at the granularity of proof-relevant program components, while a symbolic verifier acts as a deterministic critic that checks legality, satisfiability, and proof adequacy, rejecting or repairing candidates in an iterative loop. This design turns unconstrained text generation into constrained structured synthesis, substantially reducing hallucinations and producing proof-ready annotations. We evaluate AutoSpec+ on seven benchmark suites, showing strong effectiveness. We release an open-source, Dockerized system with ensemble LLM backends and inter-modular verification support for reproducible demonstration and deployment
Bridging Kernel Drivers and Virtual Device Models with LLM-Powered Automation
Mingyu Wang | Bin Yu | Wenjian Lu | Zhi Wang | Gao Kefeng | Cheng Wen | Xu Lu | Cong Tian
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Mingyu Wang | Bin Yu | Wenjian Lu | Zhi Wang | Gao Kefeng | Cheng Wen | Xu Lu | Cong Tian
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Linux kernel device drivers are tightly coupled with hardware, making them difficult to execute and test without physical devices. This heavily limits automated code analysis and vulnerability discovery. While manual modeling is unscalable, Large Language Models (LLMs) offer a new approach to scale virtual device construction across the Linux driver ecosystem. In this paper, we present DevGen, an LLM-powered tool that generates QEMU-based virtual devices directly from Linux driver source code. DevGen combines static analysis to gather necessary context, guides the LLM through step-by-step prompting, and uses an automated self-correction loop driven by compilation and execution feedback. To further reduce errors, similar fixes are retrieved from a library of common modeling failures and incorporated into the repair prompt, which supports more targeted corrections in later iterations. The generated devices finally integrate with QEMU and Syzkaller, enabling driver fuzzing without physical hardware. DevGen is evaluated on 50 PCI/PCIe drivers from Linux 6.18 using three mainstream LLMs, and successfully generates usable models for 44 drivers. In these drivers, 24% of them achieve significant improvements in fuzzing coverage, and 7 previously unknown crashes are triggered with 1 CVE assigned. These results demonstrate the practical capability of LLMs to automate complex, system-level code generation tasks.