Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4v, Phi4, ...) (AAAI 2025).
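For readers new to the PEFT route mentioned above: instead of updating all weights, low-rank adapters are injected and only those train. Below is a minimal sketch using the Hugging Face transformers and peft libraries with gpt2 as a stand-in checkpoint; it illustrates the general LoRA recipe, not this project's own CLI, and the hyperparameters are illustrative assumptions.

```python
# Minimal LoRA SFT sketch (illustrative; not any specific repo's CLI).
# Assumes: pip install torch transformers peft; "gpt2" is a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "gpt2"  # stand-in; swap in any causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the base model with low-rank adapters; only adapter weights train.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["c_attn"],  # gpt2's fused attention proj
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the total

opt = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-4)

# One toy SFT step: next-token loss on a single instruction/response pair.
text = "Instruction: say hi.\nResponse: hi!"
batch = tokenizer(text, return_tensors="pt")
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()
opt.step()
```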
ChatGLM-6B and Alpaca fine-tuning.
An open-source solution for full-parameter fine-tuning of the full 671B DeepSeek-V3/R1, including complete code and scripts from training to inference, along with practical experience and conclusions.
Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.).
Code and datasets for "Character-LLM: A Trainable Agent for Role-Playing"
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory
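The memory savings in the zeroth-order approach come from estimating gradients with two forward passes under a shared random perturbation (as in MeZO, which this line of work builds on), so no backward pass or activation storage is needed. A toy numpy sketch of the SPSA-style estimator on a quadratic objective; all names and constants are chosen for illustration only.

```python
# Toy sketch of MeZO/SPSA-style zeroth-order optimization (illustrative).
# Two forward passes estimate a directional gradient; no backprop needed.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=4)                 # stand-in "parameters"
target = np.array([1.0, -2.0, 0.5, 3.0])

def loss(p):                               # toy objective: distance to target
    return float(np.sum((p - target) ** 2))

eps, lr = 1e-3, 0.05
for step in range(500):
    z = rng.normal(size=theta.shape)       # shared perturbation direction
    g = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
    theta -= lr * g * z                    # projected-gradient update

print(loss(theta))                         # should be close to 0
```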
Train Large Language Models on MLX.
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
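For context on the MoE idea above: a learned gate routes each token to a small subset of expert feed-forward networks, so only a fraction of parameters is active per token. Below is a minimal PyTorch sketch of a top-2 gated MoE block; the dimensions, routing loop, and class name are illustrative assumptions, not LLaMA-MoE's exact design.

```python
# Minimal top-2 gated mixture-of-experts FFN (illustrative dimensions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFFN(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)        # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts))

    def forward(self, x):                                # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)
        w, idx = scores.topk(self.top_k, dim=-1)         # per-token experts
        w = w / w.sum(dim=-1, keepdim=True)              # renormalize weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):                      # simple loop; real
            for e, expert in enumerate(self.experts):    # impls batch this
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += w[mask, k:k+1] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(MoEFFN()(tokens).shape)  # torch.Size([10, 64])
```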
SEA is an automated paper-review framework that generates comprehensive, high-quality, and highly consistent review feedback, helping researchers improve their work.
Fine-Tuning Dataset Auto-Generation for Graph Query Languages.
This project draws on representative prior work to evaluate SFT data along multiple dimensions and automatically filter it.
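To make the multi-dimensional filtering idea concrete: each instruction/response pair is scored on several quality heuristics, and low scorers are dropped. The sketch below uses two illustrative dimensions, exact-duplicate detection and response length; the project's actual metrics and thresholds are not specified here, so these heuristics and names are assumptions.

```python
# Toy multi-dimension SFT data filter (heuristics are illustrative only).
samples = [
    {"instruction": "Explain SFT.", "response": "SFT fine-tunes a base "
     "model on instruction/response pairs with next-token loss."},
    {"instruction": "Say hi.", "response": "hi"},                    # too short
    {"instruction": "Explain SFT.", "response": "SFT fine-tunes a base "
     "model on instruction/response pairs with next-token loss."},   # duplicate
]

def keep(sample, seen, min_resp_chars=20):
    key = (sample["instruction"], sample["response"])
    if key in seen:                          # dimension 1: deduplication
        return False
    seen.add(key)
    return len(sample["response"]) >= min_resp_chars  # dimension 2: length

seen = set()
filtered = [s for s in samples if keep(s, seen)]
print(len(filtered))  # 1
```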
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?