GitHub - SceneCOT/scenecot: Step-by-step reasoning framework for 3D scene understanding

SceneCOT: Eliciting Chain-of-Thought Reasoning in 3D Scenes

Xiongkun Linghu, Jiangyong Huang, Ziyu Zhu, Baoxiong Jia, Siyuan Huang

SceneCOT: We propose SceneCOT, a Chain-of-Thought (COT) reasoning method for 3D scenes. It decouples a complex reasoning task into simpler, more manageable sub-problems and builds the corresponding visual clues with multimodal expert modules. To our knowledge, this is the first successful application of COT to human-like step-by-step reasoning in 3D scene understanding, and it shows strong potential for extension to a wider range of 3D scene understanding scenarios.
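To make the idea of decoupling a task into simpler steps concrete, here is a minimal toy sketch in Python. It is not the SceneCOT implementation (the code is not yet released); every name in it (`SceneObject`, `ReasoningTrace`, `ground_objects`, `count_objects`, `answer_counting_question`) is hypothetical, and the "expert modules" are stand-ins that filter a hand-written object list rather than real perception models. It only illustrates splitting a counting question into a grounding step and a counting step, with each step recorded in a trace.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneObject:
    """Hypothetical toy record: a labeled object with a 3D center."""
    label: str
    center: Tuple[float, float, float]


@dataclass
class ReasoningTrace:
    """Collects the intermediate steps so the reasoning stays inspectable."""
    steps: List[str] = field(default_factory=list)

    def log(self, text: str) -> None:
        self.steps.append(text)


def ground_objects(scene: List[SceneObject], label: str,
                   trace: ReasoningTrace) -> List[SceneObject]:
    # Stand-in for a grounding expert: select objects matching the label.
    hits = [obj for obj in scene if obj.label == label]
    trace.log(f"grounding: found {len(hits)} object(s) labeled '{label}'")
    return hits


def count_objects(objects: List[SceneObject], trace: ReasoningTrace) -> int:
    # Stand-in for the final step: derive the answer from the grounded clue.
    n = len(objects)
    trace.log(f"answer: {n}")
    return n


def answer_counting_question(scene: List[SceneObject], label: str):
    # Step 1: identify the task; step 2: ground; step 3: answer.
    trace = ReasoningTrace()
    trace.log(f"task: count objects labeled '{label}'")
    hits = ground_objects(scene, label, trace)
    return count_objects(hits, trace), trace


# Toy scene, invented purely for illustration.
scene = [
    SceneObject("chair", (0.0, 0.0, 0.0)),
    SceneObject("chair", (1.0, 0.0, 0.0)),
    SceneObject("table", (0.5, 1.0, 0.0)),
]
answer, trace = answer_counting_question(scene, "chair")
```

Printing `trace.steps` shows the intermediate steps in order, which is the property a step-by-step method relies on: each sub-problem produces an explicit clue that the next step consumes.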

SceneCOT Framework

SceneCOT achieves strong performance on MSQA and Beacon3D, demonstrating the effectiveness of our reasoning framework. In particular, our method significantly improves performance on counting, the most challenging task in MSQA, and outperforms previous methods by a large margin on Beacon3D.

🔥 News

[2025-6] We released the webpage of SceneCOT.

📝 TODO List

Evaluation code
Model weights
SceneCOT-212K dataset
Training code

