Overview | Installation | Get Started | Documentation | Community | Citing torchtune | License
- February 2025: Multi-node training is officially open for business in torchtune! Run full finetunes across multiple nodes to take advantage of larger batch sizes and models.
- December 2024: torchtune now supports Llama 3.3 70B! Try it out by following our installation instructions here, then run any of the configs here.
- November 2024: torchtune has released v0.4.0, which includes stable support for exciting features like activation offloading and multimodal QLoRA.
- November 2024: torchtune has added Gemma2 to its models!
- October 2024: torchtune added support for Qwen2.5 models - find the configs here
- September 2024: torchtune has support for Llama 3.2 11B Vision, Llama 3.2 3B, and Llama 3.2 1B models! Try them out by following our installation instructions here, then run any of the text configs here or vision configs here.
torchtune is a PyTorch library for easily authoring, post-training, and experimenting with LLMs. It provides:
- Hackable training recipes for SFT, knowledge distillation, DPO, PPO, GRPO, and quantization-aware training
- Simple PyTorch implementations of popular LLMs like Llama, Gemma, Mistral, Phi, Qwen, and more
- Best-in-class memory efficiency, performance improvements, and scaling, utilizing the latest PyTorch APIs
- YAML configs for easily configuring training, evaluation, quantization, or inference recipes (see the sketch below)
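As a rough sketch of that config-driven workflow using the `tune` CLI (the copied filename and the `batch_size` override below are illustrative, not prescribed by this README):

```bash
# List the built-in recipes and their associated configs
tune ls

# Copy a built-in YAML config locally so it can be edited
tune cp llama3_2/3B_lora_single_device ./my_3B_lora.yaml

# Run the recipe with the edited config; individual fields can also be
# overridden on the command line as key=value pairs
tune run lora_finetune_single_device --config ./my_3B_lora.yaml batch_size=8
```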
torchtune supports the entire post-training lifecycle. A successfully post-trained model will likely use several of the methods below.
| Type of Weight Update | 1 Device | >1 Device | >1 Node |
|---|---|---|---|
| Full | ✅ | ✅ | ✅ |
| LoRA/QLoRA | ✅ | ✅ | ✅ |
Example: `tune run lora_finetune_single_device --config llama3_2/3B_lora_single_device`

You can also run e.g. `tune ls lora_finetune_single_device` for a full list of available configs.
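For the multi-device column in the table above, the same workflow applies with the distributed recipes, which accept torchrun-style flags through `tune run`. A minimal sketch, assuming two local GPUs (the distributed config names here are illustrative; check `tune ls` for what is actually available):

```bash
# Single-node, multi-GPU LoRA finetune on 2 GPUs
tune run --nproc_per_node 2 lora_finetune_distributed --config llama3_2/3B_lora

# Single-node, multi-GPU full finetune on 2 GPUs
tune run --nproc_per_node 2 full_finetune_distributed --config llama3_2/3B_full
```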