Dual-Reasoner: Bridging Interleaved Atomicity and Streaming Latency via Thinking-while-Talking

Yangzhuo Li; Shengpeng Ji; Yifu Chen; Tianle Liang; Haoyu Yang; Junboli; Jun Fang; Lin Li; Qingyang Hong

Abstract

Integrating explicit Chain-of-Thought (CoT) into end-to-end spoken dialogue models enhances intelligence but incurs prohibitive latency. While the "Thinking-while-Talking" paradigm alleviates this delay, it fundamentally compromises block atomicity, severing the logical connection between interleaved thought and speech. To address this, we present Dual-Reasoner, which employs a Streaming Masking Mechanism, underpinned by our Dual-Think-30k dataset, to guarantee uninterrupted audio streaming. Crucially, to strictly align the fragmented thinking blocks with the speech generation they serve, we introduce the Atomic-Consistency Restoration framework. To secure comprehensive capabilities in high-difficulty reasoning, this framework uses a quadruple-constraint system to reconstruct logical atomicity, ensuring that "think" chunks act as a rigorous anchor for "talk" outputs. Experimental results demonstrate that Dual-Reasoner achieves comprehensive reasoning enhancements within ultra-low latency constraints: it raises the VoiceBench score over the baseline from 67.24 to 73.41, while reducing the Time-to-First-Audio (TTFA) from 20.35 s to 3.65 s and the Real-Time Factor (RTF) from 7.04 to 1.05.

Anthology ID: 2026.findings-acl.199
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4081–4105
URL: https://aclanthology.org/2026.findings-acl.199/
Cite (ACL): Yangzhuo Li, Shengpeng Ji, Yifu Chen, Tianle Liang, Haoyu Yang, Junboli, Jun Fang, Lin Li, and Qingyang Hong. 2026. Dual-Reasoner: Bridging Interleaved Atomicity and Streaming Latency via Thinking-while-Talking. In Findings of the Association for Computational Linguistics: ACL 2026, pages 4081–4105, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Dual-Reasoner: Bridging Interleaved Atomicity and Streaming Latency via Thinking-while-Talking (Li et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.199.pdf
Checklist: 2026.findings-acl.199.checklist.pdf
