Video generation models have made rapid progress. They now produce outputs that rival cinematic footage in visual fidelity. Yet despite this progress, they have not meaningfully disrupted games. This is not a failure of scale or realism, but of structure. Games are not passive; they are stateful, interactive systems governed by rules, persistence, and player agency. Most generative models were designed for producing sequences, not for operating inside live simulations. Moonlake’s Reverie is a generative model built for games, starting from first principles about how games actually function.
Games are simulations that run in real time. If generation cannot respond at frame time, it cannot be used during play: input latency breaks immersion, mechanics become unresponsive, and gameplay stalls. This is a hard requirement. For example, in a real-time combat scenario, even brief generation delays cause player actions to register late, breaking core mechanics. Reverie is designed for real-time generation so that it does not block gameplay.
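To make the frame-time constraint concrete, here is a minimal sketch of a game loop that never waits on generation. It is an illustration of the requirement, not Reverie's implementation: the 60 FPS target, `slow_generate`, and `run_frames` are all assumptions made up for this example. Generation runs on a background worker, and each frame only checks whether a result is ready.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Assumed target: 60 FPS leaves roughly 16.7 ms per frame.
FRAME_BUDGET_S = 1.0 / 60.0


def slow_generate(prompt: str) -> str:
    """Hypothetical stand-in for a generation call that spans several frames."""
    time.sleep(0.05)  # about three frames at 60 FPS
    return f"generated:{prompt}"


def run_frames(num_frames: int, prompt: str) -> List[str]:
    """Run a fixed number of frames, swapping in the generated asset once ready."""
    shown: List[str] = []
    current = "placeholder"
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_generate, prompt)
        for _ in range(num_frames):
            frame_start = time.perf_counter()
            # Poll instead of blocking: the frame proceeds either way.
            if pending is not None and pending.done():
                current = pending.result()
                pending = None
            shown.append(current)  # stand-in for rendering the frame
            # Sleep out whatever remains of this frame's budget.
            elapsed = time.perf_counter() - frame_start
            time.sleep(max(0.0, FRAME_BUDGET_S - elapsed))
    return shown


if __name__ == "__main__":
    frames = run_frames(12, "lava")
    print(frames[0], "->", frames[-1])
```

The loop keeps rendering the placeholder until the result arrives, so input handling and mechanics are never stalled by a pending generation.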
Unlike video clips, games do not end after a few seconds. To achieve unbounded runtime, we need a persistent representation. Reverie is trained to operate across hours of gameplay and repeated interactions while remaining stable, by conditioning on persistent representations constructed by our multimodal reasoning models.
This persistence allows Reverie to reskin the game into different styles in real time while maintaining object identities.
Fun is defined by mechanics, not just visuals. AI built for games should enable more mechanical control for both creators and players. We trained Reverie to be capable of exposing a programmable interface that can be bound to events, triggered by state changes, and invoked through authored logic. Creators can define behaviors such as transforming an environment when a boss enters a new phase or altering an NPC when it is interacted with.
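As a rough illustration of what binding generation to game events could look like, the sketch below registers handlers that turn state changes into generation requests. Everything here is hypothetical: `GenerationHooks`, the event names, and the prompt strings are invented for this example and are not Reverie's actual interface.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Optional

Handler = Callable[..., Optional[str]]


class GenerationHooks:
    """Hypothetical registry mapping game events to generation requests."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.requests: List[str] = []  # requests a generator would consume

    def on(self, event: str) -> Callable[[Handler], Handler]:
        """Decorator that binds a handler to a named event."""
        def register(fn: Handler) -> Handler:
            self._handlers[event].append(fn)
            return fn
        return register

    def emit(self, event: str, **state: Any) -> None:
        """Run every handler bound to the event; queue any request it returns."""
        for fn in self._handlers[event]:
            request = fn(**state)
            if request is not None:
                self.requests.append(request)


hooks = GenerationHooks()


@hooks.on("boss_phase_changed")
def restyle_arena(phase: int, **_: Any) -> Optional[str]:
    # Authored logic: only transform the environment from phase 2 onward.
    return "arena: molten ruins" if phase >= 2 else None


@hooks.on("npc_interacted")
def alter_npc(npc_id: str, **_: Any) -> Optional[str]:
    return f"npc {npc_id}: reveal true form"
```

In this sketch the game would call `hooks.emit("boss_phase_changed", phase=2)` when the boss's state changes, and the queued request would be handed to the generator on a later frame.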
Reverie is not a video model adapted for games. It is a game-native diffusion model trained to operate under real-time constraints, support unbounded runtime, and expose generation as a programmable part of gameplay. This is the difference between generating images of worlds and generating worlds that can be played.
We are a frontier research lab building multimodal intelligence for interactive world creation.
If you share this vision, join us or get in touch at info@moonlakeai.com