Zidong Cao1 · Jinjing Zhu1 · Weiming Zhang1 · Hao Ai2 · Haotian Bai1
Hengshuang Zhao3 · Lin Wang4†
1AI Thrust, HKUST (GZ) 2University of Birmingham 3HKU 4NTU
†corresponding author
We propose a semi-supervised learning framework to learn a panoramic Depth Anything, dubbed PanDA. PanDA first learns a teacher model by fine-tuning Depth Anything through joint training on synthetic indoor and outdoor panoramic datasets. Then, a student model is trained using large-scale unlabeled data, leveraging pseudo-labels generated by the teacher model. PanDA exhibits impressive zero-shot capability across diverse scenes.
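The two-stage recipe above (train a teacher on labeled data, then train a student on the teacher's pseudo-labels) can be illustrated with a toy sketch. This is not the PanDA training code: the "model" here is a 1D least-squares line rather than a depth network, and `fit_line`, `predict`, and the synthetic data are hypothetical stand-ins chosen only to make the data flow visible.

```python
"""Toy two-stage pseudo-labeling pipeline (illustrative only, not the PanDA code)."""
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; stands in for 'training a model'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def predict(model, xs):
    """Apply a fitted (a, b) model to a list of inputs."""
    a, b = model
    return [a * x + b for x in xs]


rng = random.Random(0)

# Stage 1: a small labeled set (stand-in for the synthetic panoramic datasets).
labeled_x = [rng.uniform(0.0, 1.0) for _ in range(20)]
labeled_y = [2.0 * x + 0.5 for x in labeled_x]
teacher = fit_line(labeled_x, labeled_y)

# Stage 2: a larger unlabeled set (stand-in for the unlabeled panoramas).
unlabeled_x = [rng.uniform(0.0, 1.0) for _ in range(500)]
pseudo_y = predict(teacher, unlabeled_x)  # teacher outputs serve as targets
student = fit_line(unlabeled_x, pseudo_y)

print("teacher:", teacher)
print("student:", student)
```

The point of the sketch is only the ordering: the student never sees ground-truth labels, so its quality is bounded by how well the teacher's pseudo-labels generalize to the unlabeled data.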
- 2025-03-18: Code release.
- 2025-02-27: PanDA is accepted by CVPR 2025.
- 2025-02-06: Our survey on 360 vision is accepted by IJCV. We hope you find it helpful. [Link]
Some of the panoramas used in our manuscript (RGB only, without depth labels) were captured by ourselves. The dataset link is here. It contains about 10,000 panoramas at 4K resolution. Note that we do not claim the dataset as a technical contribution.
We provide three models for robust relative panoramic depth estimation (predicted depth values lie in the range 0~1):
Model | Params | Checkpoint |
---|---|---|
PanDA-Small | 24.8M | Download |
PanDA-Base | 97.5M | Download |
PanDA-Large | 335.3M | Download |
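Relative depth is only meaningful up to scale and shift, so it is often convenient to bring any depth map into the same 0~1 range before visualizing or comparing it. The sketch below shows plain min-max scaling on a nested list. It is an assumption for illustration: this README does not state how the released models normalize their output, and `normalize_depth` is a hypothetical helper, not a function from this repository.

```python
def normalize_depth(depth):
    """Min-max scale a 2D list of depth values into [0, 1].

    Hypothetical helper for illustration; not part of the PanDA code base.
    """
    flat = [v for row in depth for v in row]
    d_min, d_max = min(flat), max(flat)
    span = d_max - d_min
    if span == 0:
        # A constant map carries no relative depth information.
        return [[0.0 for _ in row] for row in depth]
    return [[(v - d_min) / span for v in row] for row in depth]


raw = [[2.0, 4.0], [6.0, 10.0]]
print(normalize_depth(raw))
```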
git clone https://github.com/caozidong/PanDA
cd PanDA
pip install -r requirements.txt
Note: We use python==3.10, pytorch==2.0.0, cuda==11.7, and cudnn==8.5.0.
Download the checkpoints listed here and put them under the checkpoints directory.
python run_image.py \
--config ./config/inference/panda_<large, base, small>.yaml \
--img-path <path> --outdir <outdir> \
[--height <height>] [--resize] [--pred-only] \
[--grayscale] [--save-cloud]
Options:
- --config: Model config file.
- --img-path: An image directory, a single image path, or a text file listing image paths.
- --height: Height of the ERP image; the width is 2 x height. The default height is 504. A larger height can give better predictions (1008x2016 requires more than 40GB of GPU memory).
- --resize (optional): Resize the output depth to the spatial resolution of the input ERP image.
- --pred-only (optional): Save only the predicted depth map, without the raw image.
- --grayscale (optional): Save a grayscale depth map, without applying a color palette.
- --save-cloud (optional): Save the colored point cloud result.
For example:
python run_image.py --config ./config/inference/panda_large.yaml \
--img-path ./erp_samples/ --pred-only
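The --height and --save-cloud options both rest on the equirectangular (ERP) layout, where an image of height H has width 2H and each pixel corresponds to a direction on the sphere. The sketch below shows one way to back-project an ERP depth map into 3D points. It is a minimal illustration under stated assumptions, not the repository's implementation: the axis convention (y up, longitude zero at the image center), the treatment of depth as radial distance, and the helper names `erp_size`, `erp_pixel_to_direction`, and `depth_to_points` are all hypothetical.

```python
import math


def erp_size(height):
    """Return (height, width) of an ERP image; the width is twice the height."""
    return height, 2 * height


def erp_pixel_to_direction(row, col, height, width):
    """Unit ray through the center of pixel (row, col).

    Assumed convention: longitude spans [-pi, pi) left to right,
    latitude spans [pi/2, -pi/2] top to bottom, y axis points up.
    """
    lon = ((col + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (row + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z


def depth_to_points(depth):
    """Scale each pixel's unit ray by its depth (treated as radial distance)."""
    height = len(depth)
    width = len(depth[0])
    points = []
    for r in range(height):
        for c in range(width):
            dx, dy, dz = erp_pixel_to_direction(r, c, height, width)
            d = depth[r][c]
            points.append((d * dx, d * dy, d * dz))
    return points


h, w = erp_size(4)
constant_depth = [[2.0] * w for _ in range(h)]
cloud = depth_to_points(constant_depth)
print(len(cloud), "points")
```

Because the released models predict relative depth, a cloud built this way is only defined up to an unknown scale and shift; a constant depth map, as in the usage lines, simply yields points on a sphere.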
python run_video.py \
--config ./config/inference/panda_<large, base, small>.yaml \
--video-path assets/examples_video --outdir video_depth_vis \
[--height <height>]
Please refer to train_teacher.
Please refer to train_student.
Please refer to train_metric depth.
We sincerely thank the authors of Depth Anything v1 and Depth Anything v2 for contributing such impressive models and code to our community. We also sincerely thank UniFuse for providing training and evaluation code.