weikaih04 (Weikai Huang) / Starred · GitHub

QwQ is the reasoning model series developed by Qwen team, Alibaba Cloud.

Python 508 20 Updated Mar 27, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 11,376 831 Updated May 15, 2025
Python 3,989 375 Updated Jun 13, 2025

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)

Python 88 6 Updated Jun 30, 2025

Official repo for LayoutGPT

Python 357 22 Updated Apr 10, 2024
Python 100 10 Updated Apr 22, 2025

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Python 2,233 315 Updated Mar 3, 2025

[CVPR 2025] Any6D: Model-free 6D Pose Estimation of Novel Objects

Jupyter Notebook 223 8 Updated Jun 5, 2025

[CVPR2024] Code for "SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation".

Python 539 58 Updated Jul 9, 2024

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,528 247 Updated May 6, 2025

Official PyTorch Implementation for "Stereo3DMOT: Stereo Vision Based 3D Multi-Object Tracking with Multimodal ReID, PRCV2023"

Python 22 1 Updated Jul 8, 2024

A curated list of awesome Deep Stereo Matching resources

TeX 424 22 Updated Jul 3, 2025

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."

Python 1,834 136 Updated Mar 13, 2025

Official implementation of Continuous 3D Perception Model with Persistent State

Python 917 49 Updated Jul 3, 2025

We extend Segment Anything to 3D perception by combining it with VoxelNeXt.

Jupyter Notebook 557 25 Updated Apr 18, 2023

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 9,450 904 Updated Jul 5, 2025
Python 18 1 Updated May 21, 2025

[CVPR2024 Oral] EscherNet: A Generative Model for Scalable View Synthesis

Python 340 19 Updated Sep 10, 2024

Transparent Image Layer Diffusion using Latent Transparency

2,139 31 Updated Jun 16, 2024

YOLO 3D Object Detection for Autonomous Driving Vehicle

Python 326 56 Updated Jun 30, 2024

Code for "Open Vocabulary Monocular 3D Object Detection"

Python 54 4 Updated Apr 28, 2025

[ICCV 2025] Detect Anything 3D in the Wild

Python 119 1 Updated Jul 2, 2025

[ICLR'25] 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

Jupyter Notebook 349 15 Updated Jul 4, 2025

🚀🚀🚀A curated list of papers on controllable video generation.

285 22 Updated Jul 1, 2025

Official implementation of "Generating images with 3D annotations using diffusion models".

Python 49 6 Updated Aug 21, 2024

👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion"

Python 27 2 Updated Oct 24, 2024

Stereo4D dataset and processing code

Jupyter Notebook 247 6 Updated Apr 15, 2025

Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning

Python 193 7 Updated Apr 19, 2025

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 22,578 1,898 Updated Mar 26, 2025