LongLive-2.0: NVFP4 Parallel Infrastructure for Long Video Generation (NVIDIA, 1,220 HF upvotes)

NVIDIA

NVIDIA introduces LongLive-2.0, a parallel infrastructure for long video generation built on NVFP4, a 4-bit floating-point format. Its key innovations are Balanced Sequence Parallelism for autoregressive training, elimination of ODE initialization dependencies, and W4A4 (4-bit weights, 4-bit activations) NVFP4 inference with a quantized KV cache and asynchronous streaming VAE decoding. It reports a 2.15× training speedup and a 1.84× inference speedup, with the 5B model reaching 45.7 FPS. Code and models are publicly released.
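For readers unfamiliar with 4-bit floating point, the sketch below simulates block-scaled quantization onto the eight magnitudes an E2M1 value can represent (1 sign bit, 2 exponent bits, 1 mantissa bit). It is an illustration only, not LongLive-2.0 code: the function names, the block size of 16, and the full-precision per-block scale are assumptions made for this example, and real low-precision formats also store the scale itself at reduced precision, which is skipped here.

```python
# Illustrative simulation of block-scaled 4-bit floating-point quantization.
# NOT the LongLive-2.0 implementation; names and block size are assumptions.

# Magnitudes representable by an E2M1 4-bit float.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed block size for this sketch


def quantize_block(block):
    """Return (scale, codes); each code is a signed E2M1 magnitude."""
    amax = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest E2M1 value.
    scale = amax / E2M1_VALUES[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        # Round to the nearest representable magnitude.
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - target))
        codes.append(-nearest if x < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from a block's scale and codes."""
    return [scale * c for c in codes]


def fake_quantize(values, block_size=BLOCK_SIZE):
    """Quantize then dequantize, block by block, to expose rounding error."""
    out = []
    for start in range(0, len(values), block_size):
        scale, codes = quantize_block(values[start:start + block_size])
        out.extend(dequantize_block(scale, codes))
    return out
```

Calling `fake_quantize` on a list of weights or activations returns a list of the same length in which every value has been snapped to a scaled E2M1 grid point. Because each block has its own scale, one large outlier only coarsens the values that share its block.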

Why it matters

The paper received 1,220 upvotes on Hugging Face, making it the top daily paper. NVIDIA's production-grade infrastructure for long video generation directly tackles the memory and compute wall that blocks the scaling of autoregressive video models. The NVFP4 precision path previews what Blackwell-era video generation looks like at scale.

Importance: 4/5

1,220 HF upvotes (top daily paper); NVIDIA production-grade NVFP4 infrastructure achieving a 2.15× training speedup and a 1.84× inference speedup for long video generation; a reference point for Blackwell-era infrastructure.

Sources