hx8563 / Starred · GitHub

Lightweight Python library for adding real-time multi-object tracking to any detector.

Python 2,491 263 Updated Apr 30, 2025

COLMAP - Structure-from-Motion and Multi-View Stereo

C++ 8,737 1,658 Updated Jun 5, 2025

🍽️ Annotations for the public release of the EPIC-KITCHENS-100 dataset

Python 148 28 Updated Aug 1, 2022

Enhancing Zero-shot Image Retrieval with Vision Foundation Models

Python 3 Updated Nov 22, 2024
Python 1 Updated Apr 21, 2025

[ICML 2024] Official code repository for 3D embodied generalist agent LEO

Python 439 39 Updated Apr 20, 2025
Python 211 11 Updated May 27, 2025

BoT-SORT: Robust Associations Multi-Pedestrian Tracking

Jupyter Notebook 1,087 448 Updated Aug 8, 2024

Code release for "Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild"

Python 782 73 Updated Apr 7, 2024

[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer

Python 7,265 743 Updated Jun 3, 2025

SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning

Python 379 37 Updated Jun 5, 2025

Code for "Open Vocabulary Monocular 3D Object Detection"

Python 49 2 Updated Apr 28, 2025
Python 49 1 Updated Apr 30, 2025

[ECCV24] Keypoint Promptable Re-Identification: SOTA ReID method robust to occlusions and multi-person ambiguity

Python 134 16 Updated Feb 2, 2025

An MCP-based chatbot

C++ 14,589 2,785 Updated Jun 6, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,080 310 Updated May 11, 2025

[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"

Python 201 16 Updated Dec 14, 2024

Compose multimodal datasets 🎹

Python 394 17 Updated Jun 1, 2025
Python 16 1 Updated Jun 5, 2025
Python 516 72 Updated Feb 21, 2025

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,172 80 Updated Jan 23, 2025

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,842 174 Updated May 26, 2025

MoVQGAN - model for the image encoding and reconstruction

Jupyter Notebook 239 16 Updated Oct 31, 2023

Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"

Python 145 7 Updated Feb 25, 2025

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,918 188 Updated May 19, 2025

[CVPR2024] Code for "SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation".

Python 518 53 Updated Jul 9, 2024

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Python 2,136 306 Updated Mar 3, 2025

SpatialLM: Large Language Model for Spatial Understanding

Python 3,227 250 Updated Mar 28, 2025

[CVPR 2025] The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"

Python 206 11 Updated May 12, 2025