Honggyu An1* · Jin Hyeon Kim2* · Seonghoon Park3 · Jaewoo Jung1
Jisang Han1 · Sunghwan Hong2† · Seungryong Kim1†
1KAIST 2Korea University 3Samsung Electronics
*: Equal Contribution †: Corresponding Author
ZeroCo is a zero-shot correspondence model that demonstrates the effectiveness of cross-attention maps, learned through cross-view completion training, in capturing correspondences.
In this work, we explore a novel perspective on cross-view completion learning by drawing an analogy to self-supervised correspondence learning. Through our analysis, we show that cross-attention maps in cross-view completion capture correspondences more effectively than correlations derived from encoder or decoder features.
This repository introduces ZeroCo, a zero-shot correspondence model designed to demonstrate that cross-attention maps encode rich correspondences. Additionally, we provide ZeroCo-Flow and ZeroCo-Depth, which extend ZeroCo for learning-based matching and multi-frame depth estimation, respectively.
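To make the core idea concrete, here is a minimal sketch (an illustration under assumed tensor shapes, not the repository's actual code) of how a cross-attention map can be read as a correlation volume and turned into token-level correspondences by taking, for each target token, the source token it attends to most:

```python
import torch

def matches_from_cross_attention(ca_map, src_grid_hw):
    """Toy conversion of a cross-attention map into token correspondences.

    ca_map      : (N_tgt, N_src) attention of target queries over source keys
                  (this shape/ordering is an assumption for illustration).
    src_grid_hw : (H, W) token-grid size of the source view.
    Returns (N_tgt, 2) source (x, y) token coordinates, one per target token.
    """
    _, w_s = src_grid_hw
    best_src = ca_map.argmax(dim=-1)  # most-attended source token per target token
    return torch.stack([best_src % w_s, best_src // w_s], dim=-1)

# Toy usage: 14x14 token grids (196 tokens per view), random attention weights.
attn = torch.rand(196, 196).softmax(dim=-1)
print(matches_from_cross_attention(attn, (14, 14)).shape)  # torch.Size([196, 2])
```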
- Release ZeroCo code
- Release ZeroCo-Flow and ZeroCo-Depth code
- Release pretrained weights
- Create and activate a conda environment with Python 3.10.

  ```bash
  conda create -n ZeroCo python=3.10.15
  conda activate ZeroCo
  ```
- Our code is developed with PyTorch 2.1.2 and CUDA 12.1. Please refer to the requirements.txt file to install the necessary dependencies.

  ```bash
  pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121
  pip install -r requirements.txt
  ```
- Create admin/local by running the following command, then update the dataset paths in it.

  ```bash
  python -c "from admin.environment import create_default_local_file; create_default_local_file()"
  ```
- For the evaluation of the zero-shot correspondence task, we use the HPatches and ETH3D datasets.
- You can download and preprocess both datasets with the following bash scripts.

  ```bash
  bash download_ETH3D.sh
  bash download_hpatches.sh
  ```
- Since we evaluate using pretrained cross-view completion models, it is necessary to download their pretrained weights.
- The models currently implemented in our code are CroCo v1, CroCo v2, DUSt3R, and MASt3R. Please visit each model's repository to obtain the pretrained weights and download them into the ./pretrained_weights folder (a quick sanity check is sketched after this list).
- Additionally, you can directly evaluate models that share the DUSt3R architecture, such as MonST3R.
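As a quick way to verify that a downloaded checkpoint is readable (a generic PyTorch check, not part of the repository's code; the file name below matches the CroCo v2 checkpoint referenced in the example script later on):

```python
import torch

# Load the checkpoint on CPU and list its top-level keys to confirm the file is intact.
ckpt = torch.load(
    "pretrained_weights/CroCo_V2_ViTLarge_BaseDecoder.pth", map_location="cpu"
)
print(list(ckpt.keys()))  # e.g. model weights and training metadata entries
```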
The scripts folder contains multiple bash files for evaluating models on the HPatches and ETH3D datasets. Most experiments were conducted on HPatches. For each model, you can perform zero-shot evaluation of geometric matching performance using one of three methods:
- Encoder Correlation: builds a correlation map from encoder features
- Decoder Correlation: builds a correlation map from decoder features
- Cross-Attention Maps: uses the decoder's cross-attention maps directly as the correlation
For detailed explanations of each method, please refer to our paper.
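As a rough illustration of how these options differ (the helper functions and tensor shapes below are assumptions for exposition, not the repository's actual implementation), the first two build a correlation volume from pairwise feature similarity, while the third reuses the decoder's own softmax-normalised query-key attention:

```python
import torch
import torch.nn.functional as F

def feature_correlation(feat_src, feat_tgt):
    """Encoder/decoder correlation: cosine similarity between all token pairs.
    feat_src: (N_src, C), feat_tgt: (N_tgt, C) token features."""
    f_s = F.normalize(feat_src, dim=-1)
    f_t = F.normalize(feat_tgt, dim=-1)
    return f_t @ f_s.t()                                  # (N_tgt, N_src)

def attention_correlation(q_tgt, k_src):
    """Cross-attention map used directly as the correlation: scaled dot-product
    of decoder queries against source keys, softmax-normalised per query."""
    scale = q_tgt.shape[-1] ** -0.5
    return (q_tgt @ k_src.t() * scale).softmax(dim=-1)    # (N_tgt, N_src)

# Toy usage with random 196-token, 256-dim features.
src, tgt = torch.rand(196, 256), torch.rand(196, 256)
print(feature_correlation(src, tgt).shape, attention_correlation(tgt, src).shape)
```

The example commands below run the zero-shot evaluation with CroCo v2 checkpoints.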
```bash
# HPatches (Original Resolution) - CroCo v2
bash scripts/run_hp_crocov2_Largebase.sh

# HPatches (240 Resolution) - CroCo v2
bash scripts/run_hp240_crocov2_LargeBase.sh

# ETH3D - CroCo v2
bash scripts/run_eth3d_crocov2_LargeBase.sh
```
Script Configuration Details
Each evaluation script contains several key parameters that can be customized; the annotated listing below summarizes them (see the bash files in the scripts folder for the runnable form):

```bash
# Example evaluation script
CUDA=0                     # GPU device rank
CUDA_VISIBLE_DEVICES=${CUDA} python -u eval_matching.py \
  --seed 2024              # random seed
  --dataset hp             # dataset (hp: HPatches, hp-240: HPatches 240x240, eth3d: ETH3D)
  --model_img_size 224 224 # CVC model's input image dimensions
  --model crocov2          # model type [crocov1, crocov2, dust3r, mast3r]
  --pre_trained_models croco               # pre-trained model type
  --croco_ckpt /path/to/croco/ckpts/CroCo_V2_ViTLarge_BaseDecoder.pth
  --output_mode ca_map     # correlation method, one of [enc_feat, dec_feat, ca_map]
  --output_ca_map          # enable cross-attention map output
  --reciprocity            # enable reciprocal cross-attention maps
  --save_dir /path/to/save/images/for/visualisation/
```
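The --reciprocity option combines the cross-attention maps from both matching directions. The snippet below shows one plausible fusion rule (an element-wise agreement product; this is an assumption for illustration, not necessarily the exact formulation used in the code or the paper):

```python
import torch

def reciprocal_attention(ca_t2s, ca_s2t, eps=1e-8):
    """Fuse two directional cross-attention maps into one correlation.

    ca_t2s : (N_tgt, N_src) attention of target queries over source keys.
    ca_s2t : (N_src, N_tgt) attention of source queries over target keys.
    The element-wise product rewards matches that agree in both directions;
    rows are renormalised so each target token's scores sum to one.
    """
    fused = ca_t2s * ca_s2t.t()
    return fused / (fused.sum(dim=-1, keepdim=True) + eps)

# Toy usage with random maps over 196 tokens per view.
a = torch.rand(196, 196).softmax(dim=-1)
b = torch.rand(196, 196).softmax(dim=-1)
print(reciprocal_attention(a, b).shape)  # torch.Size([196, 196])
```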
This code is heavily based on DenseMatching. We highly appreciate the authors for their great work.
If you find this code useful, please consider citing our paper:
```bibtex
@article{an2024cross,
  title={Cross-View Completion Models are Zero-shot Correspondence Estimators},
  author={An, Honggyu and Kim, Jinhyeon and Park, Seonghoon and Jung, Jaewoo and Han, Jisang and Hong, Sunghwan and Kim, Seungryong},
  journal={arXiv preprint arXiv:2412.09072},
  year={2024}
}
```