ByteDance Previews Seedance 2.5: Native 4K, 30-Second Video with 50 Reference Inputs
ByteDance
Also at the June 23 Volcano Engine FORCE conference, ByteDance previewed Seedance 2.5, its next-generation video model. The model generates native 30-second video as a single clip at 4K resolution with 10-bit color depth. It accepts up to 50 multimodal reference inputs simultaneously (images, audio, 3D models, and style references), up from 12 in the previous version. Local edits made after generation preserve the clip's visual style. The model is in global enterprise beta, with public launch targeted for early July 2026.
Why it matters
Extending single-pass video generation to 30 seconds at 4K removes a key production barrier: most current models cannot produce clips of that length without stitching artifacts. The 50-reference multimodal input capacity targets professional film and advertising pipelines, directly challenging Runway and Kling at the high end.
Importance: 3/5
Native 30-second 4K video in a single pass and 50 multimodal reference inputs set a new capability bar for professional video workflows