ByteDance Previews Seedance 2.5: Native 4K, 30-Second Video with 50 Reference Inputs

ByteDance

Video official + media 3 src. ~1 min

Also at the June 23 Volcano Engine FORCE conference, ByteDance previewed Seedance 2.5, its next-generation video model. The model generates a single 30-second clip natively at 4K resolution with 10-bit color depth, and it accepts up to 50 multimodal reference inputs at once (images, audio, 3D models, and style references), up from 12 in the previous version. Local edits made after generation preserve the clip's visual style. The model is in global enterprise beta; public launch is targeted for early July 2026.

Why it matters

Extending single-pass video generation to 30 seconds at 4K meets a key production requirement that most current models cannot satisfy without stitching artifacts. The 50-reference multimodal input capacity targets professional film and advertising pipelines, directly challenging Runway and Kling at the high end.

Importance: 3/5

Native 30-second 4K video in a single pass; 50 multimodal reference inputs set a new capability bar for professional video workflows

Sources