Diffusion with Truncated Blocks: Fast and High-Quality Text Generation using Truncated Block Generation

Yuyan Zhou; Weiyu Chen; James Kwok

Abstract

Diffusion-based Large Language Models (dLLMs) are emerging as a powerful alternative to traditional autoregressive models. These models learn to generate text by iteratively denoising masked sequences. In this work, we identify a critical problem in dLLMs: the model’s attention is wastefully expended on uninformative mask tokens, diluting its focus on meaningful context. We term this phenomenon “attention dilution”. We further show that this artifact is amplified by token-level noising, whereas models employing sequence-level noise exhibit a reduced effect. To resolve this problem, we introduce Truncated Block Generation, a novel sampling algorithm that not only mitigates attention dilution but also enables faster inference and flexible-length sequence generation. Extensive experiments validate our analysis and demonstrate the marked effectiveness of our proposed method in enhancing both the performance and efficiency of dLLMs.
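
The abstract's notion of attention dilution can be illustrated with a toy calculation. The sketch below is not the paper's method or analysis. It assumes a single attention query, one shared logit for all context tokens, and one shared (lower) logit for all mask tokens. The function name `context_attention_mass` and all numeric values are hypothetical, chosen only for illustration. Under these assumptions, the share of softmax attention that reaches context tokens shrinks as the number of mask tokens grows, and shortening the masked span restores it.

```python
import math


def context_attention_mass(n_context, n_mask, context_logit=1.0, mask_logit=0.0):
    """Fraction of softmax attention mass that lands on context tokens.

    Toy model (an assumption, not taken from the paper): one query,
    every context token has the same logit, every mask token has the
    same logit.
    """
    context_weight = n_context * math.exp(context_logit)
    mask_weight = n_mask * math.exp(mask_logit)
    return context_weight / (context_weight + mask_weight)


# Hypothetical setting: 32 context tokens followed by a long masked span.
full_span = context_attention_mass(n_context=32, n_mask=224)

# Same context, but only a short block of mask tokens is kept.
short_span = context_attention_mass(n_context=32, n_mask=32)

print(f"context attention mass, 224 mask tokens: {full_span:.3f}")
print(f"context attention mass,  32 mask tokens: {short_span:.3f}")
```

In this toy model the context tokens receive a larger share of attention when fewer mask tokens are present, even though each mask token is individually down-weighted. Whether and how strongly this holds in a trained dLLM is an empirical question addressed by the paper, not by this sketch.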

Anthology ID: 2026.findings-acl.212
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4335–4348
URL: https://aclanthology.org/2026.findings-acl.212/
Cite (ACL): Yuyan Zhou, Weiyu Chen, and James Kwok. 2026. Diffusion with Truncated Blocks: Fast and High-Quality Text Generation using Truncated Block Generation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 4335–4348, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Diffusion with Truncated Blocks: Fast and High-Quality Text Generation using Truncated Block Generation (Zhou et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.212.pdf
Checklist: 2026.findings-acl.212.checklist.pdf
