csuking

2 followers · 1 following

Stars

JiuhaiChen / BLIP3o

Python 1,181 44 Updated Jun 14, 2025

zitian-gao / one-shot-em

One-shot Entropy Minimization

Python 137 7 Updated Jun 13, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 4,083 314 Updated Jun 15, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,151 244 Updated Jun 12, 2025

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,986 785 Updated May 15, 2025

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python 229 23 Updated Jun 3, 2025

PRIME-RL / TTRL

TTRL: Test-Time Reinforcement Learning

Python 628 45 Updated Jun 6, 2025

cs-holder / Reasoning-Self-Evolution-Survey

43 2 Updated Mar 6, 2025

mlfoundations / open_clip

An open source implementation of CLIP.

Python 11,943 1,111 Updated Jun 10, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 49,669 7,995 Updated Jun 15, 2025

hzwer / WritingAIPaper

Writing AI Conference Papers: A Handbook for Beginners

2,464 82 Updated Jun 5, 2025

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 12,581 2,838 Updated Jun 14, 2025

csuking / ms-swift

Forked from modelscope/ms-swift

Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 2 Updated May 10, 2025

SkyworkAI / Skywork-R1V

Skywork-R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning

Python 2,621 251 Updated Jun 10, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,962 104 Updated Jun 2, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,653 193 Updated Jun 14, 2025

Liuziyu77 / Visual-RFT

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 1,971 84 Updated May 21, 2025

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 1,300 63 Updated Feb 8, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 9,465 1,292 Updated Jun 14, 2025

aburkov / theLMbook

This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov

Jupyter Notebook 1,779 295 Updated May 21, 2025

modelscope / modelscope-classroom

Jupyter Notebook 947 113 Updated May 9, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

Cuda 11,598 844 Updated Apr 29, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,098 702 Updated Jun 15, 2025

agentica-project / rllm

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,366 308 Updated May 13, 2025

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 28,779 3,396 Updated Jan 26, 2025

Unakar / Logic-RL

Reproduce R1 Zero on Logic Puzzle

Python 2,354 158 Updated Mar 20, 2025

sail-sg / oat-zero

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 240 10 Updated Apr 15, 2025

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

Jupyter Notebook 3,970 494 Updated May 28, 2025

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

3,982 243 Updated Apr 30, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,627 271 Updated Apr 10, 2025
