Kolo is a lightweight tool for fast, efficient fine-tuning and testing of Large Language Models (LLMs) on your local machine. It leverages a cutting-edge toolchain to make fine-tuning as quick and seamless as possible.
- 🏗 Lightweight: Minimal dependencies, optimized for speed.
- ⚡ Runs Locally: No need for cloud-based services; fine-tune models on your own machine.
- 🛠 Easy Setup: Simple installation and execution with Docker.
- 🔌 Support for Popular Frameworks: Integrates with major LLM toolkits.
Kolo is built using a powerful stack of LLM tools:
- Unsloth – Open-source LLM fine-tuning; faster training, lower VRAM.
- Torchtune – Native PyTorch library simplifying LLM fine-tuning workflows.
- Llama.cpp – Fast C/C++ inference for Llama models.
- Ollama – Portable, user-friendly LLM model management and deployment.
- Docker – Containerized environment ensuring consistent, scalable deployments.
- Open WebUI – Intuitive self-hosted web interface for LLM management.
- Windows 10 or higher. May work on Linux and macOS (untested).
- NVIDIA GPU with CUDA 12.1 support and 8GB+ of VRAM
- 16GB+ of system RAM
Join our Discord group!
Ensure Hyper-V is installed.
Ensure WSL 2 is installed; an alternative guide is also available.
Ensure Docker Desktop is installed.
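Before building, you can verify the prerequisites with standard WSL and Docker commands (these are generic checks, not Kolo scripts):
wsl --status
docker --version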
./build_image.ps1
If running for the first time:
./create_and_run_container.ps1
For subsequent runs:
./run_container.ps1
./copy_training_data.ps1 -f examples/God.jsonl -d data.jsonl
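If you are building your own dataset, consult examples/God.jsonl for the exact schema. As a rough sketch, conversational training data is commonly stored as one JSON object per line; the field names below are illustrative and not confirmed by Kolo:
{"messages": [{"role": "user", "content": "Who created the world?"}, {"role": "assistant", "content": "God created the world."}]}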
./train_model_unsloth.ps1 -OutputDir "GodOutput" -Quantization "Q4_K_M" -TrainData "data.jsonl"
All available parameters:
./train_model_unsloth.ps1 -Epochs 3 -LearningRate 1e-4 -TrainData "data.jsonl" -BaseModel "unsloth/Llama-3.2-1B-Instruct-bnb-4bit" -ChatTemplate "llama-3.1" -LoraRank 16 -LoraAlpha 16 -LoraDropout 0 -MaxSeqLength 1024 -WarmupSteps 10 -SaveSteps 500 -SaveTotalLimit 5 -Seed 1337 -SchedulerType "linear" -BatchSize 2 -OutputDir "GodOutput" -Quantization "Q4_K_M" -WeightDecay 0
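Any parameter you omit presumably falls back to a script default (verify this against the script itself), so a typical run only overrides a handful of values, for example:
./train_model_unsloth.ps1 -TrainData "data.jsonl" -OutputDir "GodOutput" -Quantization "Q4_K_M" -Epochs 5 -LearningRate 5e-5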
Requirements: Create a Hugging Face account and generate an access token. You will also need permission from Meta to use their models: search for the base model name on the Hugging Face website and request access before training.
./train_model_torchtune.ps1 -OutputDir "GodOutput" -Quantization "Q4_K_M" -TrainData "data.json" -HfToken "your_token"
All available parameters:
./train_model_torchtune.ps1 -HfToken "your_token" -Epochs 3 -LearningRate 1e-4 -TrainData "data.json" -BaseModel "Meta-llama/Llama-3.2-1B-Instruct" -LoraRank 16 -LoraAlpha 16 -LoraDropout 0 -MaxSeqLength 1024 -WarmupSteps 10 -Seed 1337 -SchedulerType "cosine" -BatchSize 2 -OutputDir "GodOutput" -Quantization "Q4_K_M" -WeightDecay 0
For more information about fine-tuning parameters, please refer to the Fine Tune Training Guide.
./install_model.ps1 "God" -Tool "unsloth" -OutputDir "GodOutput" -Quantization "Q4_K_M"
./install_model.ps1 "God" -Tool "torchtune" -OutputDir "GodOutput" -Quantization "Q4_K_M"
Open your browser and navigate to http://localhost:8080 to chat with your installed model through Open WebUI.
./uninstall_model.ps1 "God"
./list_models.ps1
./delete_model.ps1 "GodOutput" -Tool "unsloth|torchtune"
To copy any scripts you have modified into the container:
./copy_scripts.ps1
To copy any configuration files you have modified into the container:
./copy_configs.ps1
To quickly SSH into the Kolo container for installing additional tools or running scripts directly:
./connect.ps1
If prompted for a password, use:
123
Alternatively, you can connect manually via SSH:
ssh root@localhost -p 2222
You can use WinSCP or any other SFTP client to access the Kolo container’s file system, which makes it easy to manage, modify, add, or remove scripts and files.
Connection Details:
- Host: localhost
- Port: 2222
- Username: root
- Password: 123
This setup lets you easily transfer files between your local machine and the container.
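If you prefer the command line to a GUI client, standard scp works with the same connection details (the file paths below are purely illustrative):
scp -P 2222 .\my_script.py root@localhost:/root/
scp -P 2222 root@localhost:/root/results.txt .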