Cola DLM: Continuous Latent Diffusion Language Model with Competitive Scaling
Cola DLM proposes an alternative to autoregressive text generation via hierarchical information decomposition: a VAE maps text into continuous latents, a diffusion transformer models the latent semantic structure, and a decoder realizes text conditioned on the denoised latents. Separating global semantic organization from local textual realization enables non-autoregressive generation while achieving scaling efficiency comparable to conventional autoregressive models at approximately 2B parameters. A minimal sketch of this three-stage pipeline follows.
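To make the three-stage split concrete, here is a minimal PyTorch sketch of the pipeline. All names (`TextVAE`, `LatentDiffusion`, `LatentDecoder`), dimensions, the single-step noising, and the crude timestep conditioning are illustrative assumptions, not the paper's actual architecture; a real instantiation would use a full diffusion noise schedule and a stronger decoder.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the paper's actual configuration is not given here.
VOCAB, D_LAT, D_MODEL, SEQ = 32000, 64, 512, 128


class TextVAE(nn.Module):
    """Encodes tokens into a continuous latent sequence (global semantics)."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.to_latent = nn.Linear(D_MODEL, 2 * D_LAT)  # predicts mean and log-variance

    def encode(self, tokens):
        h = self.embed(tokens)
        mu, logvar = self.to_latent(h).chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick


class LatentDiffusion(nn.Module):
    """Transformer denoiser over the latent sequence, run non-autoregressively."""

    def __init__(self):
        super().__init__()
        self.proj_in = nn.Linear(D_LAT, D_MODEL)
        self.t_embed = nn.Linear(1, D_MODEL)  # crude timestep conditioning
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
        self.denoiser = nn.TransformerEncoder(layer, num_layers=4)
        self.proj_out = nn.Linear(D_MODEL, D_LAT)

    def forward(self, z_noisy, t):
        h = self.proj_in(z_noisy) + self.t_embed(t[:, None, None].float())
        return self.proj_out(self.denoiser(h))  # predicts the clean latents


class LatentDecoder(nn.Module):
    """Realizes text from denoised latents (local realization); a linear head here."""

    def __init__(self):
        super().__init__()
        self.head = nn.Linear(D_LAT, VOCAB)

    def forward(self, z):
        return self.head(z)  # logits for every position in parallel


vae, dlm, dec = TextVAE(), LatentDiffusion(), LatentDecoder()
tokens = torch.randint(0, VOCAB, (2, SEQ))
z = vae.encode(tokens)                 # text -> continuous latents
z_noisy = z + torch.randn_like(z)      # one forward-diffusion step (toy noising)
z_hat = dlm(z_noisy, torch.zeros(2))   # denoise all positions at once
logits = dec(z_hat)                    # latents -> token logits, non-autoregressively
```

The structural point the sketch preserves is that the denoiser and decoder operate on every latent position in parallel, in contrast to left-to-right token prediction.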
Why it matters
49 HF Daily Papers upvotes; demonstrates competitive scaling for non-autoregressive latent diffusion text generation, strengthening the case for diffusion-based alternatives to sequential token prediction in LLMs.
Importance: 2/5
Competitive scaling behavior at ~2B parameters for continuous latent diffusion text generation.