alanMachineLeraning (abcalan) / Starred · GitHub

A Bulletproof Way to Generate Structured JSON from Language Models

Jupyter Notebook 4,756 180 Updated Feb 24, 2024

Open-source unified multimodal model

Python 4,360 364 Updated Jun 17, 2025

A SOTA open-source image editing model that aims to provide performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.

Python 1,458 62 Updated Jun 26, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,511 2,201 Updated Jun 27, 2025
Python 84 6 Updated Jun 10, 2025

Next-Token Prediction is All You Need

Python 2,156 81 Updated Mar 17, 2025

StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…

Python 3,925 207 Updated Apr 15, 2025

[CVPR 2025] Official repo for ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Jupyter Notebook 314 36 Updated Jun 17, 2025

Skywork-R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning

Python 2,645 251 Updated Jun 10, 2025
Python 534 30 Updated Nov 26, 2024

verl: Volcano Engine Reinforcement Learning for LLMs

Python 10,056 1,655 Updated Jun 27, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,327 717 Updated Jun 27, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 53,071 6,499 Updated Jun 27, 2025

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Python 964 133 Updated Apr 12, 2024

An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"

Python 163 17 Updated Apr 6, 2024

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

Python 17 1 Updated Feb 19, 2025

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Python 280 12 Updated Jun 13, 2024

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Python 381 34 Updated May 8, 2025

Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Grounding"

Python 50 2 Updated Dec 30, 2023

[ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation

Jupyter Notebook 30 3 Updated Mar 20, 2025

Video Reasoning Segmentation

21 Updated Nov 29, 2024

[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Python 240 12 Updated Dec 30, 2024

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 2,270 162 Updated Feb 16, 2025

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 569 44 Updated Jun 7, 2024

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,303 50 Updated May 30, 2025

Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".

Python 153 3 Updated Dec 13, 2024

NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Python 550 35 Updated Oct 20, 2024

Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"

Python 41 2 Updated Feb 10, 2025