Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Updated Jun 12, 2025 - Python
Code for Deep Learning for Modern AI
An implementation of GRPO for training VLMs with Unsloth
Instruction Fine-Tuning of Meta Llama 3.2-3B Instruct on Kannada Conversations. Tailoring the model to follow specific instructions in Kannada, enhancing its ability to generate relevant, context-aware responses based on conversational inputs. Using the Kannada Instruct dataset for fine-tuning! Happy Finetuning 🎋
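This project fine-tunes Deepseek-R1-Distill-Qwen-7B on medical-domain chain-of-thought (CoT) data, using QLoRA quantization and Unsloth to accelerate training, and significantly improves the model's slow-thinking ability on complex medical reasoning tasks. Knowledge distillation gives the lightweight model the reasoning strengths of a large model, yielding an efficient, accurate, and interpretable medical question-answering system.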
Fine-Tuning LLMs (Gemma, LLaMA, Mistral, etc.): a practical guide to fine-tuning various large language models using popular frameworks. Includes examples, scripts, and tips for efficient training on custom datasets.
Fine-Tuning of DeepSeek-Style Reasoning Models | RL + Quantization Implementation
PTIT's Major Project: Website Programming - This repo contains a chatbot for a clothing store. The chatbot acts as an employee with specific knowledge about clothing consultation, website support, and store information.
Cloning yourself by training a model on your WhatsApp chat history.
Information extraction from unstructured text to build a knowledge graph, using techniques ranging from traditional NLP to pre-trained transformers and LLMs for NER, linking, and relation extraction.
Materials for CSE Summer School Hackathon 2024
AstorAI is a user-friendly medical chatbot powered by Retrieval-Augmented Generation (RAG) and the advanced Llama 3 model. It offers real-time, accurate responses to a wide range of medical queries, ensuring privacy and security in every interaction. Designed for ease of use, AstorAI provides reliable health information on various topics 24/7.
Fine-tuning Llama 3.2 3B Instruct model for text generation using Unsloth AI
Finetune Web UI is a user interface for training and deploying pre-trained models.
ResurrectAI is an AI-driven chat application designed to bring the wisdom and knowledge of great historical personalities to life. Leveraging advanced language models and fine-tuning techniques, ResurrectAI enables users to interact with AI avatars of iconic figures, gaining access to their insights, guidance, and philosophical teachings in real time.
LLM-powered financial analyst using LoRA-tuned Llama-3 and RAG pipeline to answer complex queries over SEC 10-K filings with contextual accuracy.
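A solution that automatically generates high-quality question-answering (QA) data from PDF documents with GPU-accelerated processing and efficiently fine-tunes LLMs. It generates domain-specific QA pairs with the Unstructured library and AWS Bedrock Claude, and trains lightweight models with LoRA.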
Advanced Data Analysis with Causality and Reinforcement Learning