Code2Worlds

Empowering Coding LLMs for 4D World Generation

Yi Zhang*   Yunshuang Wang*   Zeyu Zhang*†   Hao Tang‡
School of Computer Science, Peking University
*Equal contribution.   †Project lead.   ‡Corresponding author.

Visualization

A curated gallery of Code2Worlds renders showcasing the geometry, lighting, and imagination that the model unlocks in every scene.

A breeze stirs through the autumn forest, gently swaying the entire tree as leaves dance in the wind.

A 10-second time-lapse capturing the transition from warm pre-dawn to sunrise, followed by midday sunlight through a green canopy, golden late afternoon light, and ending with a moonlit, misty night.

The midday sun illuminated tall trees as withered, yellowed leaves with brown spots fell silently, spinning before settling on a bright pile of fallen leaves.

A dense, lush green forest in a heavy rainstorm, with diagonal rain streaks and tall trees swaying gently in the wind and rain.

A peaceful desert afternoon with a gentle breeze, as sand flows like liquid silk down the sharp ridgeline of a dune.

A vibrant underwater scene with a translucent jellyfish glowing in soft blue and purple bioluminescence, its bell pulsating rhythmically as it drifts gracefully through the water.

A weathered chopped tree burns fiercely in the dark forest, with bright yellow flames rising from the center and glowing red embers on the edges. The flickering light casts dynamic shadows on the gnarled roots spreading into the dirt.

A tall, thick-walled ceramic cup with a wide, curved handle lies tipped on its side on the living room coffee table, spilling water from its wide rim and flooding the tabletop as it flows toward the edge and drips down.

A brown glass bottle rolled slowly across the sunlit living room floor.

A cozy bedroom with warm lighting, where a classic ceramic coffee cup sits on the desk, its smooth surface reflecting the soft glow as steam rises gently from the cup, disappearing into the light.

Method

We introduce Code2Worlds, a framework that formulates 4D generation as language-to-simulation code generation.

Model Pipeline

Code2Worlds Execution Pipeline. The framework generates 4D scenes via a dual-stream architecture followed by a refinement stage: 1) an Object Stream that uses retrieval-augmented parameter generation with object self-reflection; 2) a Scene Stream that employs hierarchical environmental orchestration; and 3) a refinement mechanism driven by a PostProcess Agent and self-reflection.
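The control flow described in this caption can be illustrated with a minimal Python sketch. This is not the Code2Worlds implementation: every name here (`Draft`, `reflect`, `object_stream`, `scene_stream`, `generate`), the keyword-match retrieval, the three scene layers, and the two-round reflection limit are assumptions made for illustration only. The `llm` and `critic` arguments stand in for whatever model calls and checks the real system uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Draft:
    """Generated simulation code plus the issues raised while refining it."""
    code: str
    notes: list = field(default_factory=list)


def reflect(draft: Draft,
            critic: Callable[[str], Optional[str]],
            revise: Callable[[str, str], str],
            max_rounds: int = 2) -> Draft:
    """Hypothetical self-reflection loop: critique, revise, repeat."""
    for _ in range(max_rounds):
        issue = critic(draft.code)
        if issue is None:  # the critic is satisfied, stop early
            break
        draft = Draft(revise(draft.code, issue), draft.notes + [issue])
    return draft


def object_stream(prompt, library, llm, critic) -> Draft:
    """Object Stream sketch: retrieve reference parameters, generate, reflect."""
    # Assumed retrieval rule: keep library entries whose key appears in the prompt.
    refs = {k: v for k, v in library.items() if k in prompt.lower()}
    draft = Draft(llm(f"object params for: {prompt}; refs: {refs}"))
    return reflect(draft, critic, lambda code, issue: llm(f"fix {issue}: {code}"))


def scene_stream(prompt, llm) -> Draft:
    """Scene Stream sketch: build the environment layer by layer."""
    layers = ["terrain", "atmosphere", "lighting"]  # assumed hierarchy
    return Draft("\n".join(llm(f"{layer} for: {prompt}") for layer in layers))


def generate(prompt, library, llm, critic) -> Draft:
    """Run both streams, merge their code, then apply a final refinement pass."""
    obj = object_stream(prompt, library, llm, critic)
    scene = scene_stream(prompt, llm)
    combined = Draft(scene.code + "\n" + obj.code, obj.notes)
    # Stand-in for the PostProcess Agent: one more reflection loop on the merged code.
    return reflect(combined, critic,
                   lambda code, issue: llm(f"postprocess {issue}: {code}"))
```

With stub callables such as `llm = lambda p: f"# {p}"` and `critic = lambda code: None`, `generate` returns a `Draft` whose `code` holds the scene layers followed by the object code, which is enough to exercise the flow without any model behind it.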

Model Pipeline

A detailed workflow for generating a 4D scene, integrating environmental scene generation, object generation, and feedback-driven refinement to ensure realistic scene rendering.
