This repo aims to include materials (papers, code, slides) about SAM2 (Segment Anything in Images and Videos). We are continuously improving the project. PRs are welcome for works (papers, repos) th…

87 3 Updated Jun 26, 2025

guangqian-guo / GleSAM

The official code of our CVPR2025 paper: "Segment Any-Quality Images with Generative Latent Space Enhancement".

Python 18 1 Updated Jun 29, 2025

congvvc / HyperSeg

Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".

Python 154 3 Updated Dec 13, 2024

xuyang-liu16 / GlobalCom2

🚀 Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models

Python 28 Updated May 21, 2025

DreamMr / HR-Bench

PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models"

Python 23 1 Updated May 13, 2025

datawhalechina / happy-llm

📚 A from-scratch tutorial on the principles and practice of large language models

5,039 366 Updated Jun 28, 2025

InterviewReady / ai-engineering-resources

Research papers and blogs to transition to AI Engineering

1,272 166 Updated Jun 25, 2025

dair-ai / Prompt-Engineering-Guide

🐙 Guides, papers, lecture, notebooks and resources for prompt engineering

MDX 58,726 5,857 Updated Jun 19, 2025

360CVGroup / Inner-Adaptor-Architecture

An LMM that solves catastrophic forgetting (AAAI 2025)

Python 43 4 Updated Apr 15, 2025

magic-research / Sa2VA

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 1,162 78 Updated Jul 1, 2025

jytmelon / G-Prune

Python 204 1 Updated Apr 16, 2025

ywh187 / FitPrune

Python 51 4 Updated May 5, 2025

FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Python 10,034 739 Updated Jun 4, 2025

mbzuai-oryx / Awesome-LLM-Post-training

Awesome Reasoning LLM Tutorial/Survey/Guide

Python 1,804 127 Updated Jun 16, 2025

FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 570 44 Updated Jun 7, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 11,310 1,146 Updated Jun 27, 2025

Sanster / IOPaint

Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.

Python 21,747 2,207 Updated Apr 29, 2025

microsoft / BitNet

Official inference framework for 1-bit LLMs

Python 20,394 1,529 Updated Jun 3, 2025

opendatalab / LOKI

[ICLR 2025 Spotlight] The official implementation of the paper "LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models"

Python 154 4 Updated Mar 31, 2025