PE3R: Perception-Efficient 3D Reconstruction. Take 2-3 photos with your phone, upload them, wait a few minutes, and then start exploring your 3D world via text!

Python 357 13 Updated Apr 1, 2025

Physical-Intelligence / openpi

Python 3,427 378 Updated May 28, 2025

policy-gradient / GRPO-Zero

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,377 54 Updated Apr 18, 2025

jingyaogong / minimind-v

🚀 [Large Model] Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour! 🌏

Python 3,642 361 Updated Apr 27, 2025

TianxingChen / Embodied-AI-Guide

[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)

5,434 353 Updated May 19, 2025

zju3dv / Murre

Code for "Multi-view Reconstruction via SfM-guided Monocular Depth Estimation". CVPR 2025 (Oral Presentation)

Python 274 23 Updated Apr 29, 2025

NVlabs / FoundationStereo

[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching

Python 1,572 92 Updated May 17, 2025

TsinghuaC3I / Awesome-RL-Reasoning-Recipes

Awesome RL Reasoning Recipes ("Triple R")

578 31 Updated May 27, 2025

Zippland / worth-calculator

Calculating the actual value of your job beyond just salary

TypeScript 1,486 81 Updated May 25, 2025

eric-ai-lab / EditRoom

[ICLR 2025] EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

Python 14 Updated Apr 1, 2025

HorizonRobotics / BIP3D

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence

Python 189 3 Updated Mar 27, 2025

lisj575 / GaussianUDF

Code release for CVPR 2025, "GaussianUDF: Inferring Unsigned Distance Functions through 3D Gaussian Splatting"

23 Updated Mar 25, 2025

unique1i / SceneSplat

Implementation of the project: SceneSplat - Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining

27 Updated Mar 20, 2025

wen-yuan-zhang / MonoInstance

[CVPR'2025] MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction

11 Updated Apr 25, 2025

vivekmadhavaram / FreeEdit

Towards a Training Free Approach for 3D Scene Editing

Python 1 Updated Apr 11, 2025

manycore-research / SpatialLM

SpatialLM: Large Language Model for Spatial Understanding

Python 3,205 250 Updated Mar 28, 2025

PzySeere / MetaSpatial

MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, realistic, and adaptive scene generation for applications in…

Python 126 5 Updated May 5, 2025

facebookresearch / vggt

[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer

Python 7,044 715 Updated May 22, 2025

facebookresearch / fast3r

[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Python 1,071 53 Updated May 7, 2025

TideDra / lmm-r1

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 758 46 Updated May 14, 2025

VAST-AI-Research / MIDI-3D

[CVPR 2025] MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation

Python 661 50 Updated May 27, 2025