Agent Distillation


agent-distillation is a library that supports distillation of large language agents into small language models, with just a few scripts!

This library accompanies our academic paper, Distilling LLM Agents into Small Models with Retrieval and Code Tools, where we demonstrate how small language models can learn to act like powerful LLM agents by mimicking their agentic behaviors, augmented with retrieval and code execution capabilities.

Built on top of smolagents v1.13.0.dev0, this library supercharges the agent training pipeline with essential utilities for logging, training, and benchmarking, all optimized for simplicity and reproducibility.

🔧 What This Library Offers

In addition to the powerful capabilities of smolagents, this library introduces:

  1. 📜 Logging: Seamlessly save agent run logs as training-ready trajectories (see the sketch after this list).
  2. 🎓 Training: Use TRL's SFT trainer to train small agents that remain compatible with smolagents.
  3. 📊 Benchmarking: Evaluate your distilled agents on factual and mathematical reasoning benchmarks using a single script.
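To make the logging-to-training handoff concrete, below is a minimal sketch of turning one saved run log into a chat-style training example in the conversational format TRL's SFT trainer accepts. The log schema used here (a JSON file with a task field and a list of steps) and the file paths are illustrative assumptions, not the library's actual format:

import json

# Hypothetical log schema (an assumption for illustration):
# {"task": str, "steps": [{"thought": str, "code": str, "observation": str}, ...]}
def log_to_messages(log_path):
    log = json.load(open(log_path))
    messages = [{"role": "user", "content": log["task"]}]
    for step in log["steps"]:
        # The teacher's thought and code action become an assistant turn...
        messages.append({"role": "assistant",
                         "content": f"Thought: {step['thought']}\nCode:\n{step['code']}"})
        # ...and the execution result is fed back as the next user turn.
        messages.append({"role": "user", "content": f"Observation: {step['observation']}"})
    return {"messages": messages}

# One JSONL line per trajectory.
with open("trajectories.jsonl", "w") as f:
    f.write(json.dumps(log_to_messages("logs/run_0.json")) + "\n")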

Recent Updates

  • [2025.05] We open-source the Agent Distillation codebase.

📦 Contents

  1. Installation
  2. Quickstart: How to Distill Agents
  3. Acknowledgements

🛠 Installation

To install the library together with its required dependencies:

conda create -n agents python=3.12
conda activate agents
pip install -e .[distill]

Note: If you want to run benchmarking, place your OpenAI API key in a file at keys/openai-key/key.env. This is required for LLM-as-a-judge evaluation on factual reasoning benchmarks.
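For reference, the key file can be loaded with python-dotenv before any evaluation call; the variable name OPENAI_API_KEY inside key.env is an assumption here:

import os
from dotenv import load_dotenv  # pip install python-dotenv

# Assumes keys/openai-key/key.env contains a line like: OPENAI_API_KEY=sk-...
load_dotenv("keys/openai-key/key.env")
assert os.environ.get("OPENAI_API_KEY"), "key.env is missing OPENAI_API_KEY"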

➕ Optional: Retriever Environment (used in our paper)

Want to reproduce or extend our retriever-enhanced setup? We follow the Search-R1 environment.

Expand the section below for setup instructions.

  1. Make a conda environment for the retriever.
conda create -n retriever python=3.10
conda activate retriever
  2. Install related libraries.
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers datasets pyserini
conda install -c pytorch -c nvidia faiss-gpu=1.8.0
pip install uvicorn fastapi
  3. Download the index and corpus from the Search-R1 repo, then assemble them.
save_path=./search/database/wikipedia
mkdir -p $save_path
python scripts/download.py --save_path $save_path
cat $save_path/part_* > $save_path/e5_Flat.index
gzip -d $save_path/wiki-18.jsonl.gz
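Once the index and corpus are in place, Search-R1 serves retrieval as an HTTP endpoint with FastAPI and uvicorn. The sketch below shows the general shape of such a server; the encoder checkpoint (intfloat/e5-base-v2), the /retrieve route, and the port are illustrative assumptions, and the actual Search-R1 serving script should be preferred:

import json
import faiss
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModel, AutoTokenizer

save_path = "./search/database/wikipedia"
index = faiss.read_index(f"{save_path}/e5_Flat.index")
# Loads the whole corpus into memory; fine for a sketch, heavy for wiki-18.
corpus = [json.loads(line) for line in open(f"{save_path}/wiki-18.jsonl")]

tok = AutoTokenizer.from_pretrained("intfloat/e5-base-v2")
enc = AutoModel.from_pretrained("intfloat/e5-base-v2").eval()

class Query(BaseModel):
    query: str
    k: int = 3

app = FastAPI()

@app.post("/retrieve")
def retrieve(q: Query):
    inputs = tok(f"query: {q.query}", return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = enc(**inputs)
        # Mean-pool token embeddings, then L2-normalize (the standard E5 recipe).
        mask = inputs["attention_mask"].unsqueeze(-1)
        emb = (out.last_hidden_state * mask).sum(1) / mask.sum(1)
        emb = torch.nn.functional.normalize(emb, dim=-1)
    scores, ids = index.search(emb.numpy(), q.k)
    return {"passages": [corpus[i] for i in ids[0]]}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8001)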

⚗️ Quickstart: How to Distill Agents

All scripts assume access to 4 GPUs.

  1. 🧪 Generate Trajectories from Teacher Agent
bash scripts/inference/run_agent_teacher_train.sh
  2. 🎓 Train the Student Agent
bash scripts/training/train_agent.sh Qwen/Qwen2.5-1.5B-Instruct
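For a rough mental model of what this training step does, here is a hedged sketch of a TRL supervised fine-tuning run on logged trajectories; the data path, output directory, and configuration are placeholders, not the script's actual values:

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder path: trajectories saved as conversational {"messages": [...]} JSONL.
dataset = load_dataset("json", data_files="trajectories.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(output_dir="training_outputs/qwen-1.5B-instruct"),
)
trainer.train()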
  3. ✅ Evaluate the Trained Agent on Benchmarks

By default, evaluation runs with self-consistent action generation enabled:

bash scripts/inference/run_agent_student.sh Qwen/Qwen2.5-1.5B-Instruct training_outputs/qwen-1.5B-instruct/agent_baseline_qwen2.5_32B_teacher
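Self-consistent action generation samples several candidate actions per step and keeps the one the model produces most often, which smooths over occasional decoding errors. A minimal, library-agnostic illustration of the voting idea (not the repository's implementation):

from collections import Counter

def majority_action(candidates):
    # Normalize whitespace so trivially different samples still vote together.
    normalized = [" ".join(c.split()) for c in candidates]
    winner, _ = Counter(normalized).most_common(1)[0]
    return winner

# E.g., 5 sampled code actions from the student model:
samples = ["search('capital of France')"] * 3 + ["search('France capital')"] * 2
print(majority_action(samples))  # -> search('capital of France')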

Or test manually:

bash scripts/inference/serve_slm.sh
# In a separate terminal:
python examples/test_small_agent.py
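If you prefer to script the manual test yourself, serve_slm.sh is expected to expose the student model behind an OpenAI-compatible endpoint; under that assumption, a test along the lines of examples/test_small_agent.py can be approximated with smolagents directly (the port and model id below are illustrative):

from smolagents import CodeAgent, OpenAIServerModel

# Assumes the serving script exposes an OpenAI-compatible API on localhost:8000.
model = OpenAIServerModel(
    model_id="Qwen/Qwen2.5-1.5B-Instruct",
    api_base="http://localhost:8000/v1",
    api_key="EMPTY",  # local servers typically ignore the key
)
agent = CodeAgent(tools=[], model=model)
print(agent.run("What is 17 * 24?"))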

More on smolagents

Curious about more capabilities? Check out the original smolagents repository for advanced usage and custom environments.

🚧 Future Plans

  • Release teacher trajectories and distilled small LMs as baselines.
  • Add detailed instructions for the first-thought prefix.
  • Provide utilities for small LMs to use tools via MCP.

🙏 Acknowledgements

This project is made possible by the foundational work of the following open-source libraries:

  • smolagents: Provides the core framework for building and running lightweight language agents, which we extend for distillation.

  • Search-R1: Supplies a dense retrieval environment used in our retriever-based experiments.

  • TRL: Offers the supervised fine-tuning framework we use to train distilled agents effectively.

We sincerely thank the developers and maintainers of these projects.

⚠️ Disclaimer

This is not an official product of KRAFTON Inc. or DeepAuto.ai. It is released solely for research purposes.
