
WorldScore: A Unified Evaluation Benchmark for World Generation

Haoyi Duan*, Hong-Xing "Koven" Yu*, Sirui Chen, Li Fei-Fei, Jiajun Wu ("*" denotes equal contribution)


🔥 Updates

  • [04/2025] Paper released.
  • [04/2025] Code released.
  • [03/2025] Leaderboard released.
  • [03/2025] Dataset released.

📣 Overview

overview.mp4

Here we showcase how the WorldScore-Static metric measures two models given an initial scene of a bedroom with a specified camera path: "pan left" → "move left" → "pull out". While existing benchmarks rate Models A and B similarly based on single-scene video quality, our WorldScore benchmark differentiates their world generation capabilities by identifying that Model B fails to generate a new scene or follow the instructed camera movement.

🚀 Setup Instructions

1. Clone the repository

git clone https://github.com/haoyi-duan/WorldScore.git
cd WorldScore

2. Configure Environment Paths

Before running, set up environment paths by creating a .env file in the root of this repository. This file should define the following variables:

  • WORLDSCORE_PATH: the root path where this repo was cloned.
  • MODEL_PATH: the root path of the model repo (e.g., MODEL_PATH/CogVideo), which is also where the evaluation outputs will be saved (e.g., MODEL_PATH/CogVideo/worldscore_output).
  • DATA_PATH: the path where the WorldScore dataset will be stored (DATA_PATH/WorldScore-Dataset).

WORLDSCORE_PATH=/path/to/worldscore
MODEL_PATH=/path/to/model
DATA_PATH=/path/to/dataset

3. Export the Environment Variables

After creating the .env file, make sure to export the variables so that they are accessible throughout the workflow.

Note

This step must be repeated in every new terminal session.

export $(grep -v '^#' .env | xargs)
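Because forgetting the export step is a common failure mode, a quick check from Python can confirm the variables are actually visible. This is a minimal sketch, not part of the WorldScore package; `missing_env_vars` is a hypothetical helper.

```python
import os

# The three variables defined in the .env file above.
REQUIRED_VARS = ["WORLDSCORE_PATH", "MODEL_PATH", "DATA_PATH"]

def missing_env_vars() -> list:
    """Return the names of required variables that are unset or empty."""
    return [v for v in REQUIRED_VARS if not os.environ.get(v)]

# Example: fail fast before launching generation or evaluation.
# if missing_env_vars():
#     raise SystemExit(f"Missing env vars: {missing_env_vars()}")
```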

4. API Access (Optional)

If you plan to run models that require API access (e.g., OpenAI), create a .secrets file in the root directory and include the required API keys.

🌍 World Generation

This section guides you through setting up your environment, downloading the dataset, and generating videos for evaluation using your own world generation models.

1. Environment Setup

First, create and activate the environment for your world generation model, then install required dependencies:

# Create the environment (example command)
conda create -n world_gen python=3.10
...
# Activate the environment
conda activate world_gen
# Install worldscore dependencies
pip install -e .

2. Dataset Download

Download the WorldScore-Dataset to the specified directory DATA_PATH.

python download.py

This will automatically download and organize the dataset into:

$DATA_PATH/WorldScore-Dataset

Ensure that your .env file has correctly defined DATA_PATH and you've exported the environment variables as explained in the Setup section.
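If you want to resolve the expected dataset location from code, a stdlib-only sketch under the layout described above (`dataset_root` is a hypothetical helper, not part of the WorldScore API):

```python
import os
from pathlib import Path

def dataset_root() -> Path:
    """Resolve $DATA_PATH/WorldScore-Dataset as described above."""
    data_path = os.environ.get("DATA_PATH")
    if not data_path:
        raise RuntimeError("DATA_PATH is not set; see Setup Instructions.")
    return Path(data_path) / "WorldScore-Dataset"

# Example: warn if the dataset has not been downloaded yet.
# if not dataset_root().is_dir():
#     print(f"Dataset not found at {dataset_root()}; run `python download.py` first.")
```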

3. Generating Videos for Evaluation

  • Register your model

    Create a configuration file named model_name.yaml in the config/model_configs directory:

    model: <model_name>
    
    runs_root: ${oc.env:MODEL_PATH}/<model_name_repo>
    
    resolution: [<W>, <H>]
    generate_type: i2v # or t2v
    
    frames: <frames> # Total number of frames per generation
    fps: <fps> # Frames per second
  • Add model to modeltype.py

    Register your model in worldscore/benchmark/utils/modeltype.py under the corresponding model type. We support the following model types:

    • "threedgen": 3D scene generation models
    • "fourdgen": 4D scene generation models
    • "videogen": video generation models

    Example:

    type2model = {
        "threedgen": [
            "wonderjourney",
            ...
        ],
        "fourdgen": [
            "4dfy",
            ...
        ],
        "videogen": [
            "cogvideox_5b_i2v",
            "model_name",
            ...
        ]
    }
  • Implement your model

    There are two ways to adapt your model for WorldScore generation. The first is to create a model class model_name.py in world_generators to support world generation:

    from typing import Literal, Optional

    class model_name:
      def __init__(
        self,
        model_name: str,
        generation_type: Literal["t2v", "i2v"],
        **kwargs
      ):
        # Initialize your model
        self.generate = ...

      def generate_video(
        self,
        prompt: str,
        image_path: Optional[str] = None,
      ):
        # Generate frames
        frames = self.generate(prompt, image_path)

        # Must return either:
        # - List[Image.Image], or
        # - torch.Tensor of shape [N, 3, H, W] with values in [0, 1]
        return frames

    Store your model's keyword arguments in model_name.yaml in the world_generators/configs directory:

    _target_: world_generators.<model_name>.<model_name>
    model_name: <model_name>
    generation_type: i2v # or t2v
    # Add any other model-specific keyword arguments here

    This approach currently supports only video generation models; refer to Model Families for more examples. For 3D and 4D scene generation models, which are more complicated to adapt, refer to Adaptation for details.

  • Run Generation

    Single-GPU:

    python world_generators/generate_videos.py --model_name <model_name>

    Multi-GPU with Slurm:

    python world_generators/generate_videos.py \
    	--model_name <model_name> \
    	--use_slurm True \
    	--num_jobs <num_gpu> \
    	--slurm_partition <your_partition> \
    	...

    Here is an overview of the output format.
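Since generate_video must return either a list of PIL images or a float tensor of shape [N, 3, H, W] with values in [0, 1], a duck-typed sanity check can catch contract violations early. This is an illustrative sketch, not part of the WorldScore API; it deliberately avoids importing torch or PIL so it stays dependency-free.

```python
from typing import Any

def frames_ok(frames: Any) -> bool:
    """Check the generate_video return contract (duck-typed)."""
    # Branch 1: a non-empty list of PIL-like images (objects with .size and .mode).
    if isinstance(frames, list):
        return len(frames) > 0 and all(
            hasattr(f, "size") and hasattr(f, "mode") for f in frames
        )
    # Branch 2: a tensor-like object of shape [N, 3, H, W] with values in [0, 1].
    shape = getattr(frames, "shape", None)
    if shape is not None and len(shape) == 4 and shape[1] == 3:
        return bool(frames.min() >= 0) and bool(frames.max() <= 1)
    return False
```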

Tip

For more information on distributed job launching, refer to submitit.

✅ Evaluation

Run the following command to check the completeness of world generation:

worldscore-analysis -cd --model_name <model_name>

If generation is incomplete, finish it first. Once it is complete, you can run the WorldScore evaluation to assess your model's performance.

1. Environment Setup

  • Create a new conda environment
    # Tested with CUDA 12.1
    export CUDA_HOME=/path/to/cuda-12.1/
    
    conda create -n worldscore python=3.10 && conda activate worldscore
  • Install the following key dependencies
    • DROID-SLAM (about 10 minutes)
      conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
      pip install torch-scatter -f https://data.pyg.org/whl/torch-2.5.1+cu121.html
      pip install --index-url https://download.pytorch.org/whl/cu121 xformers
      conda install suitesparse -c conda-forge
      pip install open3d tensorboard scipy opencv-python tqdm matplotlib pyyaml
      pip install evo --upgrade --no-binary evo
      pip install gdown
      
      git submodule update --init thirdparty/DROID-SLAM
      cd thirdparty/DROID-SLAM/
      python setup.py install
      cd ../..
    • Other dependencies
      pip install yacs loguru einops timm imageio spacy catalogue pyiqa torchmetrics pytorch_lightning cvxpy
      python -m spacy download en_core_web_sm
    • Grounding-SAM
      git submodule update --init thirdparty/Grounded-Segment-Anything
      cd thirdparty/Grounded-Segment-Anything/
      export AM_I_DOCKER=False
      export BUILD_WITH_CUDA=True
      python -m pip install -e segment_anything
      pip install --no-build-isolation -e GroundingDINO
      cd ../..
    • SAM2
      git submodule update --init thirdparty/sam2
      cd thirdparty/sam2/
      pip install -e .
      cd ../..
    • VFIMamba
      pip install causal_conv1d==1.5.0.post8 mamba_ssm==2.2.4
    • Install WorldScore dependencies
      pip install .

2. Download Evaluation Checkpoints

Run the following commands to download all the required model checkpoints:

wget -q -P ./worldscore/benchmark/metrics/checkpoints https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

wget -q -P ./worldscore/benchmark/metrics/checkpoints https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

wget -q -P ./worldscore/benchmark/metrics/checkpoints https://dl.dropboxusercontent.com/s/4j4z58wuv8o0mfz/models.zip
unzip ./worldscore/benchmark/metrics/checkpoints/models.zip -d ./worldscore/benchmark/metrics/checkpoints/

wget -q -P ./worldscore/benchmark/metrics/checkpoints https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt
wget -q -P ./worldscore/benchmark/metrics/checkpoints https://huggingface.co/facebook/sam2.1-hiera-base-plus/resolve/main/sam2.1_hiera_base_plus.pt

wget -q -P ./worldscore/benchmark/metrics/checkpoints https://huggingface.co/MCG-NJU/VFIMamba_ckpts/resolve/main/ckpt/VFIMamba.pkl

3. Run WorldScore Evaluation

WorldScore evaluates the generated videos using various spatial and temporal metrics. The evaluation pipeline supports both single-GPU and multi-GPU (via Slurm) setups.

Single-GPU:

python worldscore/run_evaluate.py --model_name <model_name>

Multi-GPU with Slurm:

python worldscore/run_evaluate.py \
	--model_name <model_name> \
	--use_slurm True \
	--num_jobs <num_gpu> \
	--slurm_partition <your_partition> \
	...

Tip

For more information on distributed job launching, refer to submitit.

After evaluation is completed, the results will be saved to worldscore_output/worldscore.json.
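The results file is plain JSON, so it can be inspected with the standard library before submission. The schema is not documented here, so this sketch only loads the file and lists its top-level keys; `load_worldscore` is a hypothetical helper.

```python
import json
from pathlib import Path

def load_worldscore(path: str) -> dict:
    """Load the evaluation results saved by run_evaluate.py."""
    return json.loads(Path(path).read_text())

# Example:
# results = load_worldscore("worldscore_output/worldscore.json")
# print(sorted(results))  # inspect top-level keys before submitting
```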

πŸ† Leaderboard

See the most up-to-date rankings and numerical results on our Leaderboard 🥇🥈🥉. There are two options for joining the WorldScore Leaderboard:

| Sampled by | Evaluated by | Comments |
| --- | --- | --- |
| Your team | Your team | Highly recommended. Follow the instructions in Setup Instructions, World Generation, and Evaluation, then submit the evaluation result worldscore_output/worldscore.json using the Submit here! form. The evaluation results will be automatically updated to the leaderboard. |
| Your team | WorldScore | You can also submit your video samples to us for evaluation, but the progress depends on our available time and resources. |

Note

If you choose to join the leaderboard using the first option (submitting evaluation results), make sure to run:

worldscore-analysis -cs --model_name <model_name>

to verify the score completeness before submitting.

📈 World Generation Models Info

| Model Type | Model Name | Ability | Version | Resolution | Video Length (s) | FPS | Frame Number |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Video | Gen-3 | I2V | 2024.07.01 | 1280x768 | 10 | 24 | 253 |
| Video | Hailuo | I2V | 2024.08.31 | 1072x720 | 5.6 | 25 | 141 |
| Video | DynamiCrafter | I2V | 2023.10.18 | 1024x576 | 5 | 10 | 50 |
| Video | VideoCrafter1-T2V | T2V | 2023.10.30 | 1024x576 | 2 | 8 | 16 |
| Video | VideoCrafter1-I2V | I2V | 2023.10.30 | 512x320 | 2 | 8 | 16 |
| Video | VideoCrafter2 | T2V | 2024.01.17 | 512x320 | 2 | 8 | 16 |
| Video | T2V-Turbo | T2V | 2024.05.29 | 512x320 | 3 | 16 | 48 |
| Video | EasyAnimate | I2V | 2024.05.29 | 1344x768 | 6 | 8 | 49 |
| Video | CogVideoX-T2V | T2V | 2024.08.12 | 720x480 | 6 | 8 | 49 |
| Video | CogVideoX-I2V | I2V | 2024.08.12 | 720x480 | 6 | 8 | 49 |
| Video | Allegro | I2V | 2024.10.20 | 1280x720 | 6 | 15 | 88 |
| Video | Vchitect-2.0 | T2V | 2025.01.14 | 768x432 | 5 | 8 | 40 |
| 3D | SceneScape | T2V | 2023.02.02 | 512x512 | 5 | 10 | 50 |
| 3D | Text2Room | I2V | 2023.03.21 | 512x512 | 5 | 10 | 50 |
| 3D | LucidDreamer | I2V | 2023.11.22 | 512x512 | 5 | 10 | 50 |
| 3D | WonderJourney | I2V | 2023.12.06 | 512x512 | 5 | 10 | 50 |
| 3D | InvisibleStitch | I2V | 2024.04.30 | 512x512 | 5 | 10 | 50 |
| 3D | WonderWorld | I2V | 2024.06.13 | 512x512 | 5 | 10 | 50 |
| 4D | 4D-fy | T2V | 2023.11.29 | 256x256 | 4 | 30 | 120 |

✒️ Citation

@article{duan2025worldscore,
  title={WorldScore: A Unified Evaluation Benchmark for World Generation},
  author={Duan, Haoyi and Yu, Hong-Xing and Chen, Sirui and Fei-Fei, Li and Wu, Jiajun},
  journal={arXiv preprint arXiv:2504.00983},
  year={2025}
}
