World Action Models: A Survey
National University of Singapore
A comprehensive survey of World Action Models (WAMs): embodied predictive-action models that forecast future states to inform robot control. The authors organize 109 methods along three design philosophies (Render-and-Decode, Latent-Only, Video-Generation-Free) and four architectural axes, and conclude that the field is converging on generating less of the future while preserving what control requires.
Why it matters
217 upvotes on HuggingFace Daily Papers (top paper of June 23). The survey provides the first rigorous taxonomy distinguishing true WAMs from video generators, at a time when compute-action trade-offs are becoming central to embodied AI design.
Importance: 3/5
217 upvotes on HuggingFace Daily Papers (top paper of June 23); first systematic taxonomy across 109 methods in the rapidly growing World Action Model field.