Addressing the Training-Inference Discrepancy in Discrete Diffusion for Text Generation

Masaki Asada, Makoto Miwa


Abstract
This study addresses the discrepancy between training and inference in discrete diffusion models for text generation. We propose two novel strategies: (1) a training schema that considers two-step diffusion processes, allowing the model to use its own predicted output as input for subsequent steps during training, and (2) a scheduling technique that gradually increases the probability of using self-generated text as training progresses. Experiments conducted on four widely used text generation benchmark datasets demonstrate that both proposed strategies improve the performance of discrete diffusion models in text generation.
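The two strategies in the abstract can be illustrated with a short sketch: a second denoising step whose input is sometimes built from the model's own prediction rather than the gold text, where the mixing probability grows over training. This is a minimal illustration only; the function names, the linear schedule, the argmax decoding of the intermediate prediction, and the loss composition are assumptions for exposition, not the authors' implementation.

import torch

def self_generated_prob(step, total_steps, max_prob=0.5):
    # Assumed linear schedule: probability of feeding back self-generated
    # text grows from 0 toward max_prob as training progresses.
    return max_prob * min(step / max(total_steps, 1), 1.0)

def two_step_training_loss(model, x0_tokens, noise_fn, loss_fn, t, step, total_steps):
    # Hypothetical sketch of a two-step discrete diffusion training update.
    # x0_tokens : ground-truth token ids (batch, seq_len)
    # noise_fn  : forward corruption process q(x_t | x_0)
    # model     : predicts x_0 logits from (x_t, t)

    # Step 1: standard denoising objective on a corrupted sample x_t.
    x_t = noise_fn(x0_tokens, t)
    x0_logits = model(x_t, t)
    loss = loss_fn(x0_logits, x0_tokens)

    # Step 2: with scheduled probability, corrupt the model's own prediction
    # instead of the gold x_0 to form the next-step input, mimicking what the
    # model actually conditions on at inference time.
    p = self_generated_prob(step, total_steps)
    if torch.rand(()) < p:
        source = x0_logits.argmax(dim=-1).detach()  # self-generated text
    else:
        source = x0_tokens                          # ground-truth text
    x_tm1 = noise_fn(source, t - 1)
    x0_logits_2 = model(x_tm1, t - 1)
    loss = loss + loss_fn(x0_logits_2, x0_tokens)
    return loss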
Anthology ID: 2025.coling-main.477
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 7156–7164
URL: https://aclanthology.org/2025.coling-main.477/
Cite (ACL): Masaki Asada and Makoto Miwa. 2025. Addressing the Training-Inference Discrepancy in Discrete Diffusion for Text Generation. In Proceedings of the 31st International Conference on Computational Linguistics, pages 7156–7164, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Addressing the Training-Inference Discrepancy in Discrete Diffusion for Text Generation (Asada & Miwa, COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.477.pdf