Cola DLM: Continuous Latent Diffusion Language Model with Competitive Scaling

Research · official · 2 sources · ~1 min read

Cola DLM proposes an alternative to autoregressive text generation via hierarchical information decomposition: a VAE maps text into continuous latents, a diffusion transformer models the latent semantic distribution, and a decoder reconstructs text conditioned on the denoised latents. Separating global semantic organization from local textual realization enables non-autoregressive generation, and the model shows scaling efficiency comparable to conventional autoregressive models at approximately 2B parameters.
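
To make the three-stage pipeline concrete, here is a minimal PyTorch sketch under toy assumptions: the module names (`TextVAE`, `LatentDiffusionTransformer`, `LatentConditionedDecoder`), the dimensions, and the plain DDPM-style sampler are all hypothetical illustrations of the encode → denoise → decode structure, not the paper's actual architecture or API.

```python
# Hypothetical sketch of a VAE -> latent diffusion -> conditional decoder pipeline.
# All names, sizes, and the DDPM-style sampler are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, SEQ_LEN, LATENT_LEN, D = 1000, 64, 16, 128

class TextVAE(nn.Module):
    """Stage 1: map token sequences to a short sequence of continuous latents."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        self.enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(D, nhead=4, batch_first=True), num_layers=2)
        self.to_mu, self.to_logvar = nn.Linear(D, D), nn.Linear(D, D)

    def encode(self, tokens):
        h = self.enc(self.embed(tokens))[:, :LATENT_LEN]  # compress to a shorter latent sequence
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick

class LatentDiffusionTransformer(nn.Module):
    """Stage 2: predict the noise added to the latents at a given timestep."""
    def __init__(self, steps=100):
        super().__init__()
        self.t_embed = nn.Embedding(steps, D)
        self.body = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(D, nhead=4, batch_first=True), num_layers=2)

    def forward(self, z_t, t):
        return self.body(z_t + self.t_embed(t)[:, None, :])  # noise prediction

class LatentConditionedDecoder(nn.Module):
    """Stage 3: emit logits for all positions in parallel, conditioned on clean latents."""
    def __init__(self):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(SEQ_LEN, D))
        self.attend = nn.MultiheadAttention(D, num_heads=4, batch_first=True)
        self.out = nn.Linear(D, VOCAB)

    def forward(self, z):
        q = self.queries.expand(z.size(0), -1, -1)
        h, _ = self.attend(q, z, z)  # every position attends to the latent sequence
        return self.out(h)

@torch.no_grad()
def sample(dit, decoder, batch=2, steps=100):
    """Non-autoregressive generation: denoise a random latent, then decode once."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    z = torch.randn(batch, LATENT_LEN, D)
    for t in reversed(range(steps)):
        tt = torch.full((batch,), t, dtype=torch.long)
        eps = dit(z, tt)
        alpha, a_bar = 1.0 - betas[t], alphas_bar[t]
        z = (z - betas[t] / (1.0 - a_bar).sqrt() * eps) / alpha.sqrt()  # DDPM mean step
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return decoder(z).argmax(-1)  # all SEQ_LEN tokens produced in one decoder pass

vae = TextVAE()
z = vae.encode(torch.randint(0, VOCAB, (2, SEQ_LEN)))  # training-time: text -> latents
tokens = sample(LatentDiffusionTransformer(), LatentConditionedDecoder())
print(z.shape, tokens.shape)  # torch.Size([2, 16, 128]) torch.Size([2, 64])
```

The point of the sketch is the decomposition: diffusion happens over a short continuous latent sequence rather than over tokens, and the decoder then emits every token position in a single parallel pass, which is what makes the generation non-autoregressive. How the real model conditions its decoder may differ.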

Why it matters

49 HF Daily Papers upvotes; demonstrates competitive scaling for non-autoregressive latent diffusion text generation, strengthening the case for diffusion-based alternatives to sequential token prediction in LLMs.

Importance: 2/5

Competitive scaling behavior at ~2B parameters for continuous latent diffusion text generation.

Sources