Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

πŸ“ Overview

Matrix-Game-3.0 is an open-sourced, memory-augmented interactive world model designed for 720p real-time long-form video generation.

Framework Overview

Our framework unifies three stages into an end-to-end pipeline:

  • Data Engine β€” an industrial-scale infinite data engine integrating Unreal Engine synthetic scenes, large-scale automated AAA game collection,and real-world video augmentation to produce high-quality Video-Pose-Action-Prompt quadruplets at scale;
  • Model Training β€” a memory-augmented Diffusion Transformer (DiT) with an error buffer that learns action-conditioned generation with memory-enhanced long-horizon consistency;
  • Inference Deployment β€” few-step sampling, INT8 quantization, and model distillation achieving 720p@40FPS real-time generation with a 5B model.

Model Overview

✨ Key Features

  • πŸš€ Feature 1: Upgraded Data Engine: Combines Unreal Engine-based synthetic data, large-scale automated AAA game data, and real-world video augmentation to generate high-quality Video–Pose–Action–Prompt data.
  • πŸ–±οΈ Feature 2: Long-horizon Memory & Consistency: Uses prediction residuals and frame re-injection for self-correction, while camera-aware memory ensures long-term spatiotemporal consistency.
  • 🎬 Feature 3: Real-Time Interactivity & Open Access: It employs a multi-segment autoregressive distillation strategy based on Distribution Matching Distillation (DMD), combined with model quantization and VAE decoder distillation to support [40fps] real-time generation at 720p resolution with a 5B model, while maintaining stable memory consistency over minute-long sequence.
  • πŸ‘ Feature 3: Scale Up 28B-MoE Model: Scaling up to a 2Γ—14B model further improves generation quality, dynamics, and generalization.

πŸ”₯ Latest Updates

  • [2026-03] πŸŽ‰ Initial release of Matrix-Game-3.0 Model

πŸš€ Quick Start

Installation

Create a conda environment and install dependencies:

conda create -n matrix-game-3.0 python=3.12 -y
conda activate matrix-game-3.0
# install FlashAttention
# Our project also depends on [FlashAttention](https://github.com/Dao-AILab/flash-attention)
git clone https://github.com/SkyworkAI/Matrix-Game-3.0.git
cd Matrix-Game-3.0
pip install -r requirements.txt

Model Download

pip install "huggingface_hub[cli]"
huggingface-cli download Matrix-Game-3.0 --local-dir Matrix-Game-3.0

Inference

Before running inference, you need to prepare:

  • Input image
  • Text prompt

After downloading pretrained models, you can use the following command to generate an interactive video with random actions:

torchrun --nproc_per_node=$NUM_GPUS generate.py --size 704*1280 --dit_fsdp --t5_fsdp --ckpt_dir Matrix-Game-3.0 --fa_version 3 --use_int8 --num_iterations 12 --num_inference_steps 3 --image demo_images/000/image.png --prompt "a vintage gas station with a classic car parked under a canopy, set against a desert landscape." --save_name test --seed 42 --compile_vae --lightvae_pruning_rate 0.5 --vae_type mg_lightvae --output_dir ./output
# "num_iterations" refers to the number of iterations you want to generate. The total number of frames generated is given by:57 + (num_iterations - 1) * 40 

Tips: If you want to use the base model, you can use "--use_base_model --num_inference_steps 50". Otherwise if you want to generating the interactive videos with your own input actions, you can use "--interactive". With multiple GPUs, you can pass --use_async_vae --async_vae_warmup_iters 1 to speed up inference.

⭐ Acknowledgements

πŸ“– Citation

If you find this work useful for your research, please kindly cite our paper:

  @misc{2026matrix,
    title={Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory},
    author={{Skywork AI Matrix-Game Team}},
    year={2026},
    howpublished={Technical report},
    url={https://github.com/SkyworkAI/Matrix-Game/blob/main/Matrix-Game-3/assets/pdf/report.pdf}
  }
Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for Skywork/Matrix-Game-3.0

Finetuned
(20)
this model