8000 GitHub - ali-vosoughi/MMPerspective
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

ali-vosoughi/MMPerspective

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 

Repository files navigation

✨ MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

🌐 Homepage | 🔬 Paper | 👩‍💻 Code | 📊 Dataset | 📈 Evaluation | 🏆 Leaderboard

What is MMPerspective?

MMPerspective is a comprehensive benchmark designed to systematically evaluate the understanding of perspective geometry by Multimodal Large Language Models (MLLMs). It comprises 10 diverse tasks across three key dimensions: Perspective Perception, Reasoning, and Robustness, with 2,711 real-world and synthetic image instances.

alt text

MMPerspective enables researchers and practitioners to uncover the strengths, limitations, and potential areas for improvement in MLLMs, offering valuable insights into the challenges of understanding perspective geometry.

🏆 Leaderboard

Link

📉 Statistics

alt text

Link

Data Curation Pipeline

alt text

👀 Visualization Results

alt text

✏️ Citation

@article{tang2025mmperspective,
  title = {MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness},
  author = {Tang, Yunlong and Liu, Pinxin and Feng, Mingqian and Tan, Zhangyun and Mao, Rui and Huang, Chao and Bi, Jing and Xiao, Yunzhong and Liang, Susan and Hua, Hang and Vosoughi, Ali and Song, Luchuan and Zhang, Zeliang and Xu, Chenliang},
  journal = {arXiv preprint arXiv:2505.20426},
  year = {2025}
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0