Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4v, Phi4, ...) (AAAI 2025).
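For readers new to the PEFT route mentioned above: instead of updating all weights, low-rank adapters are injected and only those train. Below is a minimal sketch using the Hugging Face transformers and peft libraries with gpt2 as a stand-in checkpoint; it illustrates the general LoRA recipe, not this project's own CLI, and the hyperparameters are illustrative assumptions.

```python
# Minimal LoRA SFT sketch (illustrative; not any specific repo's CLI).
# Assumes: pip install torch transformers peft; "gpt2" is a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "gpt2"  # stand-in; swap in any causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the base model with low-rank adapters; only adapter weights train.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["c_attn"],  # gpt2's fused attention proj
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the total

opt = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-4)

# One toy SFT step: next-token loss on a single instruction/response pair.
text = "Instruction: say hi.\nResponse: hi!"
batch = tokenizer(text, return_tensors="pt")
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()
opt.step()
```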
ChatGLM-6B and Alpaca fine-tuning.
An open-source solution for full-parameter fine-tuning of the full 671B DeepSeek-V3/R1, including complete code and scripts from training to inference, along with practical experience and conclusions.
Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.).
Code and datasets for "Character-LLM: A Trainable Agent for Role-Playing"
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory
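The memory savings in the zeroth-order approach come from estimating gradients with two forward passes under a shared random perturbation (as in MeZO, which this line of work builds on), so no backward pass or activation storage is needed. A toy numpy sketch of the SPSA-style estimator on a quadratic objective; all names and constants are chosen for illustration only.

```python
# Toy sketch of MeZO/SPSA-style zeroth-order optimization (illustrative).
# Two forward passes estimate a directional gradient; no backprop needed.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=4)                 # stand-in "parameters"
target = np.array([1.0, -2.0, 0.5, 3.0])

def loss(p):                               # toy objective: distance to target
    return float(np.sum((p - target) ** 2))

eps, lr = 1e-3, 0.05
for step in range(500):
    z = rng.normal(size=theta.shape)       # shared perturbation direction
    g = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
    theta -= lr * g * z                    # projected-gradient update

print(loss(theta))                         # should be close to 0
```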
Train Large Language Models on MLX.
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
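For context on the MoE idea above: a learned gate routes each token to a small subset of expert feed-forward networks, so only a fraction of parameters is active per token. Below is a minimal PyTorch sketch of a top-2 gated MoE block; the dimensions, routing loop, and class name are illustrative assumptions, not LLaMA-MoE's exact design.

```python
# Minimal top-2 gated mixture-of-experts FFN (illustrative dimensions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFFN(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)        # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts))

    def forward(self, x):                                # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)
        w, idx = scores.topk(self.top_k, dim=-1)         # per-token experts
        w = w / w.sum(dim=-1, keepdim=True)              # renormalize weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):                      # simple loop; real
            for e, expert in enumerate(self.experts):    # impls batch this
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += w[mask, k:k+1] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(MoEFFN()(tokens).shape)  # torch.Size([10, 64])
```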
SEA is an automated paper-review framework that generates comprehensive, high-quality, and highly consistent review feedback, helping researchers improve their work.
Fine-Tuning Dataset Auto-Generation for Graph Query Languages.
This project draws on representative prior work to evaluate SFT data along multiple dimensions and automatically filter it.
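To make the multi-dimensional filtering idea concrete: each instruction/response pair is scored on several quality heuristics, and low scorers are dropped. The sketch below uses two illustrative dimensions, exact-duplicate detection and response length; the project's actual metrics and thresholds are not specified here, so these heuristics and names are assumptions.

```python
# Toy multi-dimension SFT data filter (heuristics are illustrative only).
samples = [
    {"instruction": "Explain SFT.", "response": "SFT fine-tunes a base "
     "model on instruction/response pairs with next-token loss."},
    {"instruction": "Say hi.", "response": "hi"},                    # too short
    {"instruction": "Explain SFT.", "response": "SFT fine-tunes a base "
     "model on instruction/response pairs with next-token loss."},   # duplicate
]

def keep(sample, seen, min_resp_chars=20):
    key = (sample["instruction"], sample["response"])
    if key in seen:                          # dimension 1: deduplication
        return False
    seen.add(key)
    return len(sample["response"]) >= min_resp_chars  # dimension 2: length

seen = set()
filtered = [s for s in samples if keep(s, seen)]
print(len(filtered))  # 1
```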
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?