WALL-WM: Carving World Action Modeling at the Event Joints
…The event mode consumes next-event descriptions and enables variable-length execution chunks. The unified mode uses a VLM with Staircase Decoding to condition conventional fixed-length chunk inference while preserving…