seanzhuh (seanZhuh) / Starred · GitHub

Implementation for Describe Anything: Detailed Localized Image and Video Captioning

Python 1,119 59 Updated May 6, 2025
Python 144 14 Updated Apr 23, 2025

[Fully open] [Encoder-free MLLM] Vision as LoRA

Python 277 23 Updated May 26, 2025

Code for Scaling Language-Free Visual Representation Learning (WebSSL)

244 2 Updated Apr 24, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,577 837 Updated Apr 29, 2025

[CVPR2024] OneFormer3D: One Transformer for Unified Point Cloud Segmentation

Python 456 40 Updated Oct 23, 2024

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 483 22 Updated Jan 13, 2025

A systematic overview of the key topics in machine learning.

126 32 Updated Jan 19, 2019

Bird's Eye View Perception

576 31 Updated Apr 6, 2025

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,308 169 Updated Mar 28, 2025

Dettoolchain: A new prompting paradigm to unleash detection ability of MLLM

Python 37 2 Updated Oct 12, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 15,620 1,787 Updated Dec 25, 2024

[ECCV 2024] The official repo for "Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing"

Python 174 6 Updated Nov 23, 2024

Personal Implementation of the paper: Nuvo: Neural UV Mapping for Unruly 3D Representations

Python 35 1 Updated Dec 12, 2024

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

181 7 Updated Apr 3, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

557 20 Updated May 8, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 8,051 496 Updated May 18, 2025

This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation", accepted by CVPR 2024.

70 Updated Jun 3, 2024

[ECCV 2024] Tokenize Anything via Prompting

Jupyter Notebook 582 23 Updated Dec 11, 2024
Python 105 2 Updated Jun 11, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 566 41 Updated May 8, 2024

This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model

Jupyter Notebook 95 8 Updated Jul 15, 2024

The repository for Hyperbolic Representation Learning for Computer Vision, ECCV 2022

Jupyter Notebook 63 5 Updated Oct 23, 2022

Curated list of awesome works on unsupervised object localization in 2D images.

71 2 Updated Aug 19, 2024

[arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs

Python 1,387 112 Updated Aug 19, 2024

Pytorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)

Python 1,318 292 Updated Sep 7, 2023
Python 8,625 506 Updated Oct 9, 2024

[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI

Python 571 37 Updated Jan 17, 2024