LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 483 22 Updated Jan 13, 2025

Yimeng-Zhang / Machine-Learning-From-Scratch

A systematic walkthrough of the key concepts in machine learning.

126 32 Updated Jan 19, 2019

vasgaowei / BEV-Perception

Bird's Eye View Perception

576 31 Updated Apr 6, 2025

VITA-MLLM / VITA

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,308 169 Updated Mar 28, 2025

yixuan730 / DetToolChain

DetToolChain: A new prompting paradigm to unleash the detection ability of MLLMs

Python 37 2 Updated Oct 12, 2024

scratchapixel / scratchapixel-code

GLSL 358 73 Updated Oct 16, 2024

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 15,620 1,787 Updated Dec 25, 2024

slothfulxtx / Texture-GS

[ECCV 2024] The official repo for "Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing"

Python 174 6 Updated Nov 23, 2024

ruiqixu37 / Nuvo

Personal Implementation of the paper: Nuvo: Neural UV Mapping for Unruly 3D Representations

Python 35 1 Updated Dec 12, 2024

seanzhuh / Awesome-Open-Vocabulary-Detection-and-Segmentation

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

181 7 Updated Apr 3, 2025

MME-Benchmarks / Video-MME

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

557 20 Updated May 8, 2025

FoundationVision / VAR

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 8,051 496 Updated May 18, 2025

Rubics-Xuan / MRES

This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation", accepted by CVPR 2024.

70 Updated Jun 3, 2024

baaivision / tokenize-anything

[ECCV 2024] Tokenize Anything via Prompting

Jupyter Notebook 582 23 Updated Dec 11, 2024

V3Det / V3Det

Python 105 2 Updated Jun 11, 2024

shenyunhang / APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 566 41 Updated May 8, 2024

bytedance / OmniScient-Model

This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model

Jupyter Notebook 95 8 Updated Jul 15, 2024

MinaGhadimiAtigh / hyperbolic_representation_learning

The repository for Hyperbolic Representation Learning for Computer Vision, ECCV 2022

Jupyter Notebook 63 5 Updated Oct 23, 2022

valeoai / Awesome-Unsupervised-Object-Localization

Curated list of awesome works on unsupervised object localization in 2D images.

71 2 Updated Aug 19, 2024

microsoft / SoM

[arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs

Python 1,387 112 Updated Aug 19, 2024

dome272 / Diffusion-Models-pytorch

PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)

Python 1,318 292 Updated Sep 7, 2023

apple / ml-ferret

Python 8,625 506 Updated Oct 9, 2024

baaivision / Uni3D

[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI

Python 571 37 Updated Jan 17, 2024

seanZhuh seanzhuh

Stars

NVlabs / describe-anything

Alpha-Innovator / OmniCaptioner

Hon-Wong / VoRA

dfan / webssl

deepseek-ai / FlashMLA

filaPro / oneformer3d

deepseek-ai / DeepSeek-R1

ictnlp / LLaVA-Mini