weikaih04 (Weikai Huang) / Starred · GitHub

Weikai Huang weikaih04

2nd year undergrad @ UW | CSE RAIVN Lab

9 followers · 15 following

Seattle, Washington, United States
weikaih04.github.io

Highlights

Pro

Stars

QwenLM / QwQ

QwQ is the reasoning model series developed by Qwen team, Alibaba Cloud.

Python 508 20 Updated Mar 27, 2025

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 11,376 831 Updated May 15, 2025

LLaVA-VL / LLaVA-NeXT

Python 3,989 375 Updated Jun 13, 2025

UMass-Embodied-AGI / Mirage

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)

Python 88 6 Updated Jun 30, 2025

weixi-feng / LayoutGPT

Official repo for LayoutGPT

Python 357 22 Updated Apr 10, 2024

HuiZhang0812 / CreatiLayout

Python 100 10 Updated Apr 22, 2025

NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Python 2,233 315 Updated Mar 3, 2025

taeyeopl / Any6D

[CVPR 2025] Any6D: Model-free 6D Pose Estimation of Novel Objects

Jupyter Notebook 223 8 Updated Jun 5, 2025

JiehongLin / SAM-6D

[CVPR2024] Code for "SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation".

Python 539 58 Updated Jul 9, 2024

google-research / kubric

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,528 247 Updated May 6, 2025

Chain-Mao / Stereo3DMOT

Official PyTorch Implementation for "Stereo3DMOT: Stereo Vision Based 3D Multi-Object Tracking with Multimodal ReID, PRCV2023"

Python 22 1 Updated Jul 8, 2024

fabiotosi92 / Awesome-Deep-Stereo-Matching

A curated list of awesome Deep Stereo Matching resources

TeX 424 22 Updated Jul 3, 2025

YvanYin / Metric3D

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."

Python 1,834 136 Updated Mar 13, 2025

CUT3R / CUT3R

Official implementation of Continuous 3D Perception Model with Persistent State

Python 917 49 Updated Jul 3, 2025

dvlab-research / 3D-Box-Segment-Anything

We extend Segment Anything to 3D perception by combining it with VoxelNeXt.

Jupyter Notebook 557 25 Updated Apr 18, 2023

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 9,450 904 Updated Jul 5, 2025

pointarena / pointarena

Python 18 1 Updated May 21, 2025

kxhit / EscherNet

[CVPR2024 Oral] EscherNet: A Generative Model for Scalable View Synthesis

Python 340 19 Updated Sep 10, 2024

lllyasviel / LayerDiffuse

Transparent Image Layer Diffusion using Latent Transparency

2,139 31 Updated Jun 16, 2024

ruhyadi / YOLO3D

YOLO 3D Object Detection for Autonomous Driving Vehicle

Python 326 56 Updated Jun 30, 2024

UVA-Computer-Vision-Lab / ovmono3d

Code for "Open Vocabulary Monocular 3D Object Detection"

Python 54 4 Updated Apr 28, 2025

OpenDriveLab / DetAny3D

[ICCV 2025] Detect Anything 3D in the Wild

Python 119 1 Updated Jul 2, 2025

KwaiVGI / 3DTrajMaster

[ICLR'25] 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

Jupyter Notebook 349 15 Updated Jul 4, 2025

mayuelala / Awesome-Controllable-Video-Generation

🚀🚀🚀A curated list of papers on controllable video generation.

285 22 Updated Jul 1, 2025

wufeim / DST3D

Official implementation of "Generating images with 3D annotations using diffusion models".

Python 49 6 Updated Aug 21, 2024

oooolga / Ctrl-V

👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion"

Python 27 2 Updated Oct 24, 2024

Stereo4d / stereo4d-code

Stereo4D dataset and processing code

Jupyter Notebook 247 6 Updated Apr 15, 2025

facebookresearch / metamorph

Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning

Python 193 7 Updated Apr 19, 2025

microsoft / OmniParser

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 22,578 1,898 Updated Mar 26, 2025

TripleJoy / SAM2MOT

118 4 Updated Apr 17, 2025
