agent-distillation is a library for distilling large language model (LLM) agents into small language models, with just a few scripts!
This library accompanies our academic paper, Distilling LLM Agents into Small Models with Retrieval and Code Tools, where we demonstrate how small language models can learn to act like powerful LLM agents by mimicking their agentic behaviors, augmented with retrieval and code execution capabilities.
Built on top of smolagents v1.13.0.dev0, this library supercharges the agent training pipeline with essential utilities for logging, training, and benchmarking, all optimized for simplicity and reproducibility.
In addition to the powerful capabilities of smolagents, this library introduces:
- 📜 Logging: Seamlessly save agent run logs to create training-ready trajectories (see the sketch after this list).
- 🎓 Training: Use TRL's SFT trainer to train small agents that remain compatible with smolagents.
- 📊 Benchmarking: Evaluate your distilled agents on factual and mathematical reasoning benchmarks using a single script.
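To make the logging point concrete, here is a minimal sketch of capturing a trajectory with plain smolagents: run an agent, then serialize its memory as chat-format messages. The output file name and the use of `write_memory_to_messages()` as the serialization step are illustrative assumptions; the repository's own logging utilities are the supported path.

```python
# Minimal trajectory-capture sketch with plain smolagents (illustrative only;
# this repository ships dedicated logging utilities for training-ready logs).
import json

from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-32B-Instruct")  # teacher-sized model; needs an HF token
agent = CodeAgent(tools=[], model=model)

agent.run("What is the 10th Fibonacci number?")

# The agent's memory replays the run as chat-format messages, which is the
# general shape that SFT-style trainers consume.
messages = agent.write_memory_to_messages()
with open("trajectory.jsonl", "a") as f:  # assumed output location
    f.write(json.dumps({"messages": messages}, default=str) + "\n")
```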
- [2025.05] We open-source the Agent Distillation codebase.
To install the library with its required dependencies:

```bash
conda create -n agents python=3.12
conda activate agents
pip install -e ".[distill]"
```
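As a quick sanity check that the environment is set up (assuming the [distill] extra installs both smolagents and TRL; adjust if your dependency set differs):

```python
# Sanity check: core dependencies import and report their versions.
import smolagents
import trl

print("smolagents", smolagents.__version__)
print("trl", trl.__version__)
```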
Note: If you want to run benchmarking, place your OpenAI API key in a file at `keys/openai-key/key.env`. This is required for LLM-as-a-judge evaluation on factual reasoning benchmarks.
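The expected contents of key.env are a single environment-style line. The variable name OPENAI_API_KEY below follows the standard OpenAI client convention and is an assumption here, so check the benchmarking scripts if the key is not picked up:

```bash
# keys/openai-key/key.env (assumed format; substitute your real key)
OPENAI_API_KEY=your-key-here
```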
Want to reproduce or extend our retriever-enhanced setup? We follow the Search-R1 environment.
Expand the section below for setup instructions.
- Make a conda environment for the retriever.
```bash
conda create -n retriever python=3.10
conda activate retriever
```
- Install the required libraries.

```bash
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers datasets pyserini
conda install -c pytorch -c nvidia faiss-gpu=1.8.0
pip install uvicorn fastapi
```
- Save the index and corpus from the repo.

```bash
save_path=./search/database/wikipedia
mkdir -p $save_path
bash scripts/download.sh --save_path $save_path
cat $save_path/part_* > $save_path/e5_Flat.index
gzip -d $save_path/wiki-18.jsonl.gz
```
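Once the retrieval server from the Search-R1 setup is serving this index, you can sanity-check it with a single request. The host, port, /retrieve path, and JSON fields below follow the Search-R1 README and are assumptions here; match them to however you launch the server:

```python
# Sanity-check query against a Search-R1-style retrieval server (assumed endpoint).
import requests

payload = {
    "queries": ["Who wrote The Brothers Karamazov?"],  # batch of queries
    "topk": 3,                                         # passages per query
    "return_scores": True,                             # include similarity scores
}
resp = requests.post("http://127.0.0.1:8000/retrieve", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # retrieved Wikipedia passages (and scores) per query
```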
All scripts assume access to 4 GPUs.
- 🧪 Generate Trajectories from Teacher Agent
```bash
bash scripts/inference/run_agent_teacher_train.sh
```
- 🎓 Train the Student Agent
```bash
bash scripts/training/train_agent.sh Qwen/Qwen2.5-1.5B-Instruct
```
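For orientation, the shape of this step is standard TRL supervised fine-tuning over chat-format trajectories. In the sketch below, the dataset file name and the minimal SFTConfig are placeholders rather than the settings used by train_agent.sh:

```python
# Minimal TRL SFT sketch (illustrative; scripts/training/train_agent.sh is the
# actual entry point with the real configuration).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed trajectory file: one {"messages": [...]} record per line.
dataset = load_dataset("json", data_files="trajectories.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # the student model from the command above
    train_dataset=dataset,
    args=SFTConfig(output_dir="training_outputs/sft-sketch"),  # hypothetical output dir
)
trainer.train()
```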
- ✅ Evaluate the Trained Agent on Benchmarks
Runs with self-consistent action generation enabled by default:
```bash
bash scripts/inference/run_agent_student.sh Qwen/Qwen2.5-1.5B-Instruct training_outputs/qwen-1.5B-instruct/agent_baseline_qwen2.5_32B_teacher
```
Or test manually:

```bash
bash scripts/inference/serve_slm.sh
# In a separate terminal:
python examples/test_small_agent.py
```
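examples/test_small_agent.py is the maintained reference, but if you want to poke at the served student directly, something like the following should work with plain smolagents, assuming serve_slm.sh exposes an OpenAI-compatible endpoint at http://localhost:8000/v1 (check the script for the actual host, port, and served model name):

```python
# Hedged sketch: query the served student through smolagents.
# Endpoint URL, model name, and dummy API key are assumptions; see serve_slm.sh.
from smolagents import CodeAgent, OpenAIServerModel

model = OpenAIServerModel(
    model_id="Qwen/Qwen2.5-1.5B-Instruct",
    api_base="http://localhost:8000/v1",  # assumed local OpenAI-compatible server
    api_key="EMPTY",                      # vLLM-style servers accept any key
)
agent = CodeAgent(tools=[], model=model)
print(agent.run("What is the sum of the first 20 prime numbers?"))
```

Because the student is distilled to stay compatible with smolagents, it should emit code actions that CodeAgent can parse and execute directly.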
Curious about more capabilities? Check out the original smolagents repository for advanced usage and custom environments.
Planned updates:
- Release teacher trajectories and distilled small LMs as baselines.
- Add detailed instructions for first-thought prefix.
- Provide utilities for small LMs to use tools via MCP.
This project is made possible by the foundational work of the following open-source libraries:
- smolagents: Provides the core framework for building and running lightweight language agents, which we extend for distillation.
- Search-R1: Supplies a dense retrieval environment used in our retriever-based experiments.
- TRL: Offers the supervised fine-tuning framework we use to train distilled agents effectively.
We sincerely thank the developers and maintainers of these projects.
This is not an official product of KRAFTON Inc. or DeepAuto.ai. It is released solely for research purposes.