ByteDance Unveils Seedance 2.5: Native 30-Second 4K AI Video with 50 Multimodal Inputs

ByteDance

Category: Video. Sources: 4 (official + media). Reading time: ~1 min.

ByteDance announced Seedance 2.5 at its Volcano Engine FORCE conference on June 23. The model generates single 30-second clips natively at 4K with 10-bit color depth. It accepts up to 50 simultaneous multimodal inputs (images, audio, 3D white models, style references) and processes audio in the same latent space as video, so sound is synchronized natively rather than added afterward. An enterprise beta is live; the public launch is targeted for early July.

Why it matters

Seedance 2.5 more than quadruples the reference input capacity of its nearest competitor. Native 30-second generation without stitching also removes a key limitation of current video models, raising the bar for long-form AI video generation.

Importance: 4/5

Native 30-second 4K video with up to 50 multimodal reference inputs is a qualitative leap over the 5–10 second clips that are the current norm among major competitors.

text-to-video image-to-video video-generation bytedance chinese-lab 4k release

Sources

official ByteDance Seed — Official Site

media ByteDance unveils Seedance 2.5, a 30-second native 4K AI video model — The Next Web

media ByteDance's Seedance 2.5 breaks the 30-second barrier for AI video generation — The Decoder

media ByteDance Seedance 2.5: Native 30-Second AI Video, No Stitching Required — TechTimes