Abstract
We introduce Generalized Discrete Diffusion from Snapshots (GDDS), a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Our formulation encompasses all existing discrete diffusion approaches while allowing significantly greater flexibility in the choice of corruption dynamics. The forward noising process relies on uniformization, enabling fast simulation of arbitrary corruption. For the reverse process, we derive a simple evidence lower bound (ELBO) based on snapshot latents rather than the entire noising path, which allows efficient training of standard generative modeling architectures with a clear probabilistic interpretation. Our experiments on large-vocabulary discrete generation tasks suggest that the proposed framework outperforms existing discrete diffusion methods in terms of training efficiency and generation quality, and beats autoregressive models for the first time at this scale. We provide the code along with a blog post on the project page: https://oussamazekri.fr/gdds
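To make the uniformization idea concrete, here is a minimal sketch of how a continuous-time Markov chain (CTMC) forward corruption process can be simulated exactly via uniformization: a Poisson number of jump events is drawn, and each event applies a discrete transition kernel. The generator matrix `Q`, the function name, and the small two-state example are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def uniformization_sample(x0, Q, t, rng):
    """Sample X_t of a CTMC with generator Q started at state x0.

    Uniformization: pick a dominating rate lam >= max_i(-Q_ii), draw
    N ~ Poisson(lam * t) jump events, and apply the uniformized kernel
    P = I + Q / lam at each event. The rows of P sum to 1 because the
    rows of a valid generator Q sum to 0.
    """
    lam = np.max(-np.diag(Q))            # dominating jump rate
    P = np.eye(Q.shape[0]) + Q / lam     # uniformized transition matrix
    x = x0
    for _ in range(rng.poisson(lam * t)):
        x = rng.choice(Q.shape[0], p=P[x])
    return x

# Toy usage: a symmetric two-state chain converges to the uniform
# distribution as t grows.
rng = np.random.default_rng(0)
Q = np.array([[-1.0, 1.0],
              [1.0, -1.0]])
samples = [uniformization_sample(0, Q, t=5.0, rng=rng) for _ in range(2000)]
```

The appeal of this scheme for diffusion-style training is that it jumps directly to a snapshot at time `t` without discretizing the whole trajectory, which is what makes rich (non-mask, non-uniform) corruption processes cheap to sample from.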
Community
GDDS is a modular framework for discrete diffusion modeling over large discrete state spaces. The idea is to make much richer discrete noising processes practical, instead of restricting diffusion to mask/uniform noise.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Unifying Masked Diffusion Models with Various Generation Orders and Beyond (2026)
- Efficient Sampling with Discrete Diffusion Models: Sharp and Adaptive Guarantees (2026)
- Causal Autoregressive Diffusion Language Model (2026)
- CRoCoDiL: Continuous and Robust Conditioned Diffusion for Language (2026)
- Discrete Diffusion with Sample-Efficient Estimators for Conditionals (2026)
- Is Your Diffusion Sampler Actually Correct? A Sampler-Centric Evaluation of Discrete Diffusion Language Models (2026)
- Generative Diffusion Model for Risk-Neutral Derivative Pricing (2026)
Models citing this paper 1