Haoyi Duan*, Hong-Xing "Koven" Yu*, Sirui Chen, Li Fei-Fei, Jiajun Wu ("*" denotes equal contribution)
- Updates
- Overview
- Setup Instructions
- World Generation
- Evaluation
- Leaderboard
- World Generation Models Info
- Citation
- [04/2025] Paper released.
- [04/2025] Code released.
- [03/2025] Leaderboard released.
- [03/2025] Dataset released.
(Overview video: overview.mp4)
Here we showcase how the WorldScore-Static metric measures two models given an initial scene of a bedroom with a specified camera path: "pan left" → "move left" → "pull out". While existing benchmarks rate Models A and B similarly based on single-scene video quality, our WorldScore benchmark differentiates their world generation capabilities by identifying that Model B fails to generate a new scene or follow the instructed camera movement.
git clone https://github.com/haoyi-duan/WorldScore.git
cd WorldScore
Before running, you need to set up environment paths by creating a .env file in the root of this repository. This file should contain the following variables: WORLDSCORE_PATH is the root path where this repo was cloned. MODEL_PATH is the root path of the model repo (e.g., MODEL_PATH/CogVideo) and is also where the evaluation outputs will be saved (e.g., MODEL_PATH/CogVideo/worldscore_output). Finally, DATA_PATH is the path to where the WorldScore dataset will be stored (DATA_PATH/WorldScore-Dataset).
WORLDSCORE_PATH=/path/to/worldscore
MODEL_PATH=/path/to/model
DATA_PATH=/path/to/dataset
After creating the .env file, make sure to export the variables so that they are accessible throughout the workflow.
Note
This step must be repeated in every new terminal session.
export $(grep -v '^#' .env | xargs)
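To confirm the variables are available in your current shell, you can print them, for example:

```bash
echo "WORLDSCORE_PATH=$WORLDSCORE_PATH"
echo "MODEL_PATH=$MODEL_PATH"
echo "DATA_PATH=$DATA_PATH"
```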
If you plan to run models that require API access (e.g., OpenAI), create a .secrets file in the root directory and include the required API keys.
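The exact key names depend on which APIs your model wrapper reads; as an illustration only (OPENAI_API_KEY is the conventional variable for the OpenAI client, not necessarily what every model here expects), a .secrets file might look like:

```bash
# .secrets -- illustrative only; use whichever key names your model requires
OPENAI_API_KEY=sk-...
```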
This section guides you through setting up your environment, downloading the dataset, and generating videos for evaluation using your own world generation models.
First, create and activate the environment for your world generation model, then install required dependencies:
# Create the environment (example command)
conda create -n world_gen python=3.10
...
# Activate the environment
conda activate world_gen
# Install worldscore dependencies
pip install -e .
Download the WorldScore-Dataset to the specified directory DATA_PATH.
python download.py
This will automatically download and organize the dataset into:
$DATA_PATH/WorldScore-Dataset
Ensure that your .env file has correctly defined DATA_PATH and that you've exported the environment variables as explained in the Setup section.
- Create a configuration file named model_name.yaml in the config/model_configs directory:

  model: <model_name>
  runs_root: ${oc.env:MODEL_PATH}/<model_name_repo>
  resolution: [<W>, <H>]
  generate_type: i2v  # or t2v
  frames: <frames>  # Total number of frames per generation
  fps: <fps>  # Frames per second
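  For instance, a hypothetical config for a CogVideoX-style I2V model, filling in the resolution, frame count, and FPS listed in the World Generation Models Info table below (adjust the values to your own model):

```yaml
model: cogvideox_5b_i2v
runs_root: ${oc.env:MODEL_PATH}/CogVideo
resolution: [720, 480]
generate_type: i2v
frames: 49  # total frames per generation
fps: 8      # frames per second
```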
- Register your model in the file worldscore/benchmark/utils/modeltype.py under the corresponding model type. We support the following model types:
  - "threedgen": 3D scene generation models
  - "fourdgen": 4D scene generation models
  - "videogen": video generation models

  Example:

  type2model = {
      "threedgen": [
          "wonderjourney",
          ...
      ],
      "fourdgen": [
          "4dfy",
          ...
      ],
      "videogen": [
          "cogvideox_5b_i2v",
          "model_name",
          ...
      ],
  }
- There are two ways to adapt your model for WorldScore generation. One way is to create a model class model_name.py in world_generators:

  class model_name:
      def __init__(
          self,
          model_name: str,
          generation_type: Literal["t2v", "i2v"],
          **kwargs
      ):
          # Initialize your model
          self.generate = ...

      def generate_video(
          self,
          prompt: str,
          image_path: Optional[str] = None,
      ):
          # Generate frames
          frames = self.generate(prompt, image_path)
          # Must return either:
          # - List[Image.Image], or
          # - torch.Tensor of shape [N, 3, H, W] with values in [0, 1]
          return frames
  Store your model's keyword arguments in model_name.yaml in the world_generators/configs directory:

  _target_: world_generators.<model_name>.<model_name>
  model_name: <model_name>
  generation_type: i2v  # or t2v
  # Add any other model-specific keyword arguments here
  **kwargs
  For now, this way only supports video generation models; refer to Model Families for more examples. For 3D and 4D scene generation models, which are more complicated to adapt, refer to Adaptation for more details.
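As a sanity check on the generate_video return contract above, here is a minimal, hypothetical helper (not part of the repo) that normalizes either supported return type into a [N, 3, H, W] float tensor with values in [0, 1]:

```python
from typing import List, Union

import numpy as np
import torch
from PIL import Image


def frames_to_tensor(frames: Union[List[Image.Image], torch.Tensor]) -> torch.Tensor:
    """Normalize generate_video() output to a float tensor of shape [N, 3, H, W] in [0, 1]."""
    if isinstance(frames, torch.Tensor):
        # Already expected to be [N, 3, H, W] with values in [0, 1]
        return frames.float()
    # List[Image.Image] -> [N, H, W, 3] array in [0, 1] -> [N, 3, H, W] tensor
    arrays = [np.asarray(img.convert("RGB"), dtype=np.float32) / 255.0 for img in frames]
    return torch.from_numpy(np.stack(arrays, axis=0)).permute(0, 3, 1, 2)
```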
- Single-GPU:

  python world_generators/generate_videos.py --model_name <model_name>
  Multi-GPU with Slurm:

  python world_generators/generate_videos.py \
      --model_name <model_name> \
      --use_slurm True \
      --num_jobs <num_gpu> \
      --slurm_partition <your_partition> \
      ...
Here is an overview of the output format.
Tip
For more information on distributed job launching, refer to submitit.
Run the following command to check the completeness of world generation:
worldscore-analysis -cd --model_name <model_name>
If generation is incomplete, finish it first. If it is complete, you can now run the WorldScore evaluation to assess your model's performance.
- # Tested on CUDA 12.1
  export CUDA_HOME=/path/to/cuda-12.1/
  conda create -n worldscore python=3.10 && conda activate worldscore
- conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
  pip install torch-scatter -f https://data.pyg.org/whl/torch-2.5.1+cu121.html
  pip install --index-url https://download.pytorch.org/whl/cu121 xformers
  conda install suitesparse -c conda-forge
  pip install open3d tensorboard scipy opencv-python tqdm matplotlib pyyaml
  pip install evo --upgrade --no-binary evo
  pip install gdown
  git submodule update --init thirdparty/DROID-SLAM
  cd thirdparty/DROID-SLAM/
  python setup.py install
  cd ../..
- pip install yacs loguru einops timm imageio spacy catalogue pyiqa torchmetrics pytorch_lightning cvxpy
  python -m spacy download en_core_web_sm
- git submodule update --init thirdparty/Grounded-Segment-Anything
  cd thirdparty/Grounded-Segment-Anything/
  export AM_I_DOCKER=False
  export BUILD_WITH_CUDA=True
  python -m pip install -e segment_anything
  pip install --no-build-isolation -e GroundingDINO
  cd ../..
- git submodule update --init thirdparty/sam2
  cd thirdparty/sam2/
  pip install -e .
  cd ../..
- pip install causal_conv1d==1.5.0.post8 mamba_ssm==2.2.4
- pip install .
- Run the following commands to download all the required model checkpoints:
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://dl.dropboxusercontent.com/s/4j4z58wuv8o0mfz/models.zip
unzip ./worldscore/benchmark/metrics/checkpoints/models.zip -d ./worldscore/benchmark/metrics/checkpoints/
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://huggingface.co/facebook/sam2.1-hiera-base-plus/resolve/main/sam2.1_hiera_base_plus.pt
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://huggingface.co/MCG-NJU/VFIMamba_ckpts/resolve/main/ckpt/VFIMamba.pkl
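Once the downloads finish, you can sanity-check the checkpoints directory; the exact contents extracted from models.zip may vary, but the individually downloaded files above should be present:

```bash
ls ./worldscore/benchmark/metrics/checkpoints
# Should include: groundingdino_swint_ogc.pth, sam_vit_h_4b8939.pth,
# sam2.1_hiera_large.pt, sam2.1_hiera_base_plus.pt, VFIMamba.pkl,
# plus whatever was extracted from models.zip
```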
WorldScore evaluates the generated videos using various spatial and temporal metrics. The evaluation pipeline supports both single-GPU and multi-GPU (via Slurm) setups.
Single-GPU:
python worldscore/run_evaluate.py --model_name <model_name>
Multi-GPU with Slurm:
python worldscore/run_evaluate.py \
--model_name <model_name> \
--use_slurm True \
--num_jobs <num_gpu> \
--slurm_partition <your_partition> \
...
Tip
For more information on distributed job launching, refer to submitit.
After evaluation is completed, the results will be saved at worldscore_output/worldscore.json.
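If you want to inspect the results programmatically, a minimal sketch (the exact fields depend on the metrics the evaluation computed) is:

```python
import json
from pathlib import Path

# Adjust the path to <MODEL_PATH>/<model_repo>/worldscore_output/worldscore.json
results = json.loads(Path("worldscore_output/worldscore.json").read_text())
print(json.dumps(results, indent=2))  # pretty-print whatever scores were produced
```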
See the most up-to-date rankings and numerical results on our Leaderboard 🔥🔥🔥. There are two options to join the WorldScore Leaderboard:
Sampled by | Evaluated by | Comments |
---|---|---|
Your team | Your team | Highly recommended. Follow the instructions in Setup Instructions, World Generation, and Evaluation, then submit the evaluation result worldscore_output/worldscore.json via the Submit here! form. The evaluation results will be automatically updated to the leaderboard. |
Your team | WorldScore | You can also submit your video samples to us for evaluation, but progress depends on our available time and resources. |
Note
If you choose to join the leaderboard using the first option (submitting evaluation results), make sure to run:
worldscore-analysis -cs --model_name <model_name>
to verify score completeness before submitting.
Model Type | Model Name | Ability | Version | Resolution | Video Length (s) | FPS | Frame Number |
---|---|---|---|---|---|---|---|
Video | Gen-3 | I2V | 2024.07.01 | 1280x768 | 10 | 24 | 253 |
Video | Hailuo | I2V | 2024.08.31 | 1072x720 | 5.6 | 25 | 141 |
Video | DynamiCrafter | I2V | 2023.10.18 | 1024x576 | 5 | 10 | 50 |
Video | VideoCrafter1-T2V | T2V | 2023.10.30 | 1024x576 | 2 | 8 | 16 |
Video | VideoCrafter1-I2V | I2V | 2023.10.30 | 512x320 | 2 | 8 | 16 |
Video | VideoCrafter2 | T2V | 2024.01.17 | 512x320 | 2 | 8 | 16 |
Video | T2V-Turbo | T2V | 2024.05.29 | 512x320 | 3 | 16 | 48 |
Video | EasyAnimate | I2V | 2024.05.29 | 1344x768 | 6 | 8 | 49 |
Video | CogVideoX-T2V | T2V | 2024.08.12 | 720x480 | 6 | 8 | 49 |
Video | CogVideoX-I2V | I2V | 2024.08.12 | 720x480 | 6 | 8 | 49 |
Video | Allegro | I2V | 2024.10.20 | 1280x720 | 6 | 15 | 88 |
Video | Vchitect-2.0 | T2V | 2025.01.14 | 768x432 | 5 | 8 | 40 |
3D | SceneScape | T2V | 2023.02.02 | 512x512 | 5 | 10 | 50 |
3D | Text2Room | I2V | 2023.03.21 | 512x512 | 5 | 10 | 50 |
3D | LucidDreamer | I2V | 2023.11.22 | 512x512 | 5 | 10 | 50 |
3D | WonderJourney | I2V | 2023.12.06 | 512x512 | 5 | 10 | 50 |
3D | InvisibleStitch | I2V | 2024.04.30 | 512x512 | 5 | 10 | 50 |
3D | WonderWorld | I2V | 2024.06.13 | 512x512 | 5 | 10 | 50 |
4D | 4D-fy | T2V | 2023.11.29 | 256x256 | 4 | 30 | 120 |
@article{duan2025worldscore,
title={WorldScore: A Unified Evaluation Benchmark for World Generation},
author={Duan, Haoyi and Yu, Hong-Xing and Chen, Sirui and Fei-Fei, Li and Wu, Jiajun},
journal={arXiv preprint arXiv:2504.00983},
year={2025}
}