WorldDirector: Controllable World Simulator with Persistent Dynamic Object Memory
WorldDirector decouples motion planning from video rendering: an LLM coordinates 3D object trajectories and camera movements, which then drive a video generation model. The result is dynamic objects that maintain consistent visual identity even when they leave and re-enter the frame across extended sequences.
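To make the decoupling concrete, here is a minimal sketch of the planning side only, and not the paper's implementation. It assumes a hypothetical object memory keyed by ID, per-frame 3D positions, and a simple horizontal field-of-view test. The names `TrackedObject`, `Camera`, `is_visible`, and `plan_frames` are invented for illustration. The point is that the appearance record lives outside the frame, so an object that leaves and re-enters the view is looked up by ID rather than re-invented by the renderer.

```python
import math
from dataclasses import dataclass

@dataclass
class TrackedObject:
    object_id: str
    appearance: str   # stand-in for whatever identity features a renderer would condition on
    trajectory: list  # one (x, y, z) world position per frame

@dataclass
class Camera:
    position: tuple        # (x, y, z) world position
    yaw_deg: float         # rotation about the vertical axis; 0 looks along +z
    fov_deg: float = 90.0  # horizontal field of view

def is_visible(camera, point):
    """Horizontal field-of-view test (ignores depth limits and occlusion)."""
    dx = point[0] - camera.position[0]
    dz = point[2] - camera.position[2]
    angle = math.degrees(math.atan2(dx, dz))
    # Wrap the relative angle into [-180, 180) before comparing to the half-FOV.
    rel = (angle - camera.yaw_deg + 180.0) % 360.0 - 180.0
    return abs(rel) <= camera.fov_deg / 2.0

def plan_frames(objects, cameras):
    """For each frame, list (object_id, appearance) for objects inside the view."""
    memory = {o.object_id: o for o in objects}  # persists across all frames
    frames = []
    for t, cam in enumerate(cameras):
        visible = [
            (oid, obj.appearance)
            for oid, obj in memory.items()
            if is_visible(cam, obj.trajectory[t])
        ]
        frames.append(visible)
    return frames

# A dog walks out of view to the right, then returns to its starting point.
dog = TrackedObject("dog", "brown terrier", [(0, 0, 5), (10, 0, 5), (0, 0, 5)])
static_cam = Camera(position=(0, 0, 0), yaw_deg=0.0)
plan = plan_frames([dog], [static_cam] * 3)
print(plan)
```

In this toy run the dog is in view for the first and third frames and out of view for the second, and the third frame reuses the same stored appearance. A real system would pass such a per-frame plan to a video generation model as conditioning.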
Why it matters
Most video world models lose track of object identity over time. Decoupling semantic orchestration from pixel rendering enables persistent, re-identifiable objects under free camera viewpoints, a step toward general-purpose interactive world simulators. 18 upvotes on HuggingFace Daily Papers.
Importance: 2/5
Novel architecture for persistent object identity in video world models; 18 upvotes on HuggingFace Daily Papers.