GitHub - SceneCOT/scenecot: Step-by-step reasoning framework for 3D scene understanding

SceneCOT/scenecot

SceneCOT: Eliciting Chain-of-Thought Reasoning in 3D Scenes

   
SceneCOT: We propose a Chain-of-Thought (COT) reasoning method for 3D scenes (SceneCOT) that decomposes a complex reasoning task into simpler, manageable sub-problems and builds the corresponding visual clues using multimodal expert modules. To our knowledge, this is the first successful application of the COT technique to human-like, step-by-step reasoning for 3D scene understanding, and we show its potential to extend to a wider range of 3D scene understanding scenarios.

SceneCOT Framework

SceneCOT achieves strong performance on MSQA and Beacon3D, demonstrating the effectiveness of our reasoning framework. In particular, our method significantly improves performance on counting, the most challenging task in MSQA. It also outperforms previous methods by a large margin on Beacon3D.

🔥 News

  • [2025-6] We released the webpage of SceneCOT.

📝 TODO List

  • Evaluation code
  • Model weights
  • SceneCOT-212K dataset
  • Training code
