🚧 Note: This project is evolving rapidly. Join the community by opening issues, submitting PRs, leaving comments, or ⭐ starring the repo to help build a leading resource for agentic search.
- Research Collection: Curate and categorize comprehensive research work in agentic search, including papers, code implementations, and empirical findings
- Interactive Demos: Build demonstration pages to showcase different agentic search methods and allow hands-on exploration of their capabilities
- Evaluation Arena: Develop a Python toolkit for systematic evaluation and benchmarking of agentic search methods across diverse tasks and metrics
- Training Gym: Create a Python framework for training and optimizing agentic search models, including reinforcement learning and other approaches
For each paper, we provide the following information:
👨‍🎓 First Author · 📧 Corresponding Author (Last Author if not specified) · 🏛️ First Organization · 📊 Dataset
Note: Please submit a PR if we missed anything!
📊 Dataset Types:
General QA: NQ, TriviaQA, PopQA
Multi-Hop QA: HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle
Complex Task: GPQA, GAIA, WebWalkerQA, Humanity's Last Exam (HLE)
Report Generation: Glaive
Math & Coding: AIME, MATH500, AMC, LiveCodeBench
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
👨‍🎓 Bowen Jin · 📧 Jiawei Han · 🏛️ UIUC
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-3B / 7B · 🎯 Training: GRPO, PPO
An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents
👨‍🎓 Bowen Jin · 📧 Jiawei Han · 🏛️ UIUC
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-3B / 7B / 14B · 🎯 Training: GRPO, PPO
Note: an updated version of Search-R1.
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
👨‍🎓 Xiaoxi Li · 📧 Zhicheng Dou · 🏛️ GSAI, RUC
📊 Dataset: Complex Task, Report Generation · 🤖 Model: QwQ-32B · 🎯 Training: SFT, DPO
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
👨‍🎓 Yuxiang Zheng · 📧 Pengfei Liu · 🏛️ SJTU
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-7B · 🎯 Training: GRPO
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
👨‍🎓 Huatong Song · 📧 Wayne Xin Zhao · 🏛️ GSAI, RUC
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-7B, Llama-3.1-8B · 🎯 Training: SFT, GRPO, REINFORCE++
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
👨‍🎓 Huatong Song · 📧 Wayne Xin Zhao · 🏛️ GSAI, RUC
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-7B · 🎯 Training: SFT, GRPO, REINFORCE++
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
👨‍🎓 Shuang Sun · 📧 Wayne Xin Zhao · 🏛️ GSAI, RUC
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-7B / 32B, QwQ-32B · 🎯 Training: SFT, DPO, REINFORCE++
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
👨‍🎓 Hao Sun · 📧 Zile Qiao, Jiayan Guo, Yan Zhang · 🏛️ Tongyi Lab
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-3B / 7B, LLaMA-3.2-3B · 🎯 Training: REINFORCE, GRPO, PPO
Chain-of-Retrieval Augmented Generation
👨‍🎓 Liang Wang · 📧 Furu Wei · 🏛️ MSRA
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Llama-3.1-8B-Instruct · 🎯 Training: REINFORCE, GRPO, PPO
IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent
👨‍🎓 Ziyang Huang · 📧 Kang Liu · 🏛️ IA, CAS
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-3B / 7B · 🎯 Training: GRPO
Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
👨‍🎓 Hongjin Qian · 📧 Zheng Liu · 🏛️ BAAI
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-3B / 7B · 🎯 Training: GRPO, PPO
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs
👨‍🎓 Yaorui Shi · 📧 Xiang Wang · 🏛️ USTC
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-3B · 🎯 Training: GRPO
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
👨‍🎓 Changtai Zhu · 📧 Xipeng Qiu · 🏛️ FDU
📊 Dataset: Conversational QA · 🤖 Model: Qwen-2.5-3B / Llama-3.2-3B · 🎯 Training: SFT, GRPO
Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning
👨‍🎓 Wenlin Zhang · 📧 Xiangyu Zhao · 🏛️ CityUHK
📊 Dataset: General QA, Multi-Hop QA · 🤖 Model: Qwen-2.5-7B · 🎯 Training: DPO
WebDancer: Towards Autonomous Information Seeking Agency
👨‍🎓 Jialong Wu · 📧 Wenbiao Yin, Yong Jiang · 🏛️ Tongyi Lab
📊 Dataset: Complex Task · 🤖 Model: Qwen-2.5-7B / 32B, QwQ-32B · 🎯 Training: DAPO
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
👨‍🎓 Mingyang Chen · 📧 Fan Yang · 🏛️ Baichuan
📊 Dataset: Multi-Hop QA · 🤖 Model: Qwen-2.5-7B / 32B · 🎯 Training: GRPO
Search-o1: Agentic Search-Enhanced Large Reasoning Models
👨‍🎓 Xiaoxi Li · 📧 Zhicheng Dou · 🏛️ GSAI, RUC
📊 Dataset: General QA, Multi-Hop QA, Complex Task, Math & Coding · 🤖 Model: QwQ-32B-Preview
Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
👨‍🎓 Junde Wu · 📧 Yuyuan Liu · 🏛️ Oxford University
📊 Dataset: Complex Task · 🤖 Model: APIs
Coding Agents with Multimodal Browsing are Generalist Problem Solvers
👨‍🎓 Aditya Bharat Soni · 📧 Graham Neubig · 🏛️ CMU
📊 Dataset: Complex Task · 🤖 Model: claude-3-7-sonnet
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning
👨‍🎓 Guanting Dong · 📧 Zhicheng Dou · 🏛️ GSAI, RUC
📊 Dataset: General QA, Multi-Hop QA, Math & Coding · 🤖 Model: Qwen-2.5-3B · 🎯 Training: SFT, GRPO, PPO
OTC: Optimal Tool Calls via Reinforcement Learning
👨‍🎓 Hongru Wang · 📧 Heng Ji · 🏛️ CUHK
📊 Dataset: General QA, Multi-Hop QA, Math & Coding · 🤖 Model: Qwen-2.5-3B / 7B · 🎯 Training: GRPO, PPO
Multimodal-Search-R1: Incentivizing LMMs to Search
👨‍🎓 Jinming Wu · 📧 Zejun Ma · 🏛️ BUPT
📊 Dataset: VQA · 🤖 Model: Qwen2.5-VL-Instruct-3B / 7B · 🎯 Training: GRPO
👨‍🎓 Zhaorui Yang · 📧 Bo Zhang · 🏛️ ZJU
📊 Dataset: Report Generation
InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation
👨‍🎓 Yunjia Xi · 📧 Jianghao Lin · 🏛️ SJTU
📊 Dataset: General QA, Multi-Hop QA
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents
👨‍🎓 Jason Wei · 📧 Amelia Glaese · 🏛️ OpenAI
📊 Dataset: Web Browsing
HealthBench: Evaluating Large Language Models Towards Improved Human Health
👨‍🎓 Rahul K. Arora · 📧 Karan Singhal · 🏛️ OpenAI
📊 Dataset: Multi-turn Medical QA
👨‍🎓 Lisheng Huang · 📧 Wayne Xin Zhao · 🏛️ GSAI, RUC
📊 Dataset: Web Browsing
WebWalker: Benchmarking LLMs in Web Traversal
👨‍🎓 Jialong Wu · 📧 Deyu Zhou, Yong Jiang · 🏛️ SEU, Tongyi Lab
📊 Dataset: Web Browsing
👨‍🎓 Weinan Zhang · 🏛️ SJTU
OpenAI's Deep Research: https://openai.com/index/introducing-deep-research/
Google's Gemini Pro: https://www.google.com/search/about/
X's Grok 3: https://x.ai/news/grok-3
Perplexity: https://www.perplexity.ai/
Jina AI: https://jina.ai/deepsearch/
Metasota: https://metaso.cn/
We are building a demo page to showcase different agentic search methods and allow hands-on exploration of their capabilities. Each demo will be integrated into a standardized retrieval and web browser interface with comparable settings, enabling comprehensive and fair comparisons across various approaches. This systematic evaluation will help identify strengths and limitations of different methods and advance the state-of-the-art in agentic search.
Currently, it looks like this:
You can run the demo by serving the models via vLLM:
vllm serve path_to_your_model --port 25900 --host 127.0.0.1
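Once the server is up, you can sanity-check it through vLLM's OpenAI-compatible chat endpoint. The snippet below only constructs the request; `path_to_your_model` is the placeholder from the command above, and the port matches it:

```python
import json

# Endpoint of the vLLM server started with `vllm serve ... --port 25900 --host 127.0.0.1`.
VLLM_URL = "http://127.0.0.1:25900/v1/chat/completions"

def build_request(question: str, model: str = "path_to_your_model") -> dict:
    """Return an OpenAI-style chat payload for the local vLLM server."""
    return {
        "model": model,  # must match the model name/path you served
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.7,
        "max_tokens": 512,
    }

payload = build_request("Who wrote 'The Old Man and the Sea'?")
body = json.dumps(payload).encode("utf-8")

# To actually query the running server:
#   import urllib.request
#   req = urllib.request.Request(VLLM_URL, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())
```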
Then, start a search server, for example:
bash retrieval_launch.sh
Configure your server address in config/demo_config.json and update the model list there.
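For orientation, a minimal config might look like the sketch below. The field names (`serve_address`, `models`, `search_server`) are illustrative assumptions, not the repository's actual schema; check the shipped demo_config.json for the real keys:

```json
{
  "serve_address": "http://127.0.0.1:25900/v1",
  "models": [
    {"name": "my-search-agent", "path": "path_to_your_model"}
  ],
  "search_server": "http://127.0.0.1:8000"
}
```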
Run the demo by:
streamlit run demo/app.py
We maintain a collection of 📑 paper presentation slides on Overleaf to facilitate learning and knowledge sharing in the agentic search community. Each presentation consists of 3-5 slides that concisely introduce key aspects of a paper, including motivation, methodology, and main results. These slides serve as quick references for understanding important works in the field and can be used for self-study, teaching, or research presentations.
👉 Check out our slides collection: Agentic Search Paper Slides
We are building an arena page to benchmark different agentic search methods in a unified evaluation framework. All methods will be integrated into standardized retrieval and web browser interfaces with comparable settings, enabling comprehensive and fair comparisons across various approaches. This systematic evaluation will help identify strengths and limitations of different methods and advance the state-of-the-art in agentic search.
We are organizing a collection of optimization frameworks and training approaches used in agentic search, including reinforcement learning methods like GRPO and PPO, as well as supervised fine-tuning techniques. This will help researchers understand and implement effective training strategies for their agentic search models.
Stay tuned for detailed tutorials and code examples on training agentic search systems!
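As a small taste of the training side, here is a minimal sketch of the group-relative advantage at the heart of GRPO: each query gets a group of sampled rollouts, and every rollout's scalar reward is normalized against its own group, so no learned value model is needed. The function names are ours for illustration, not from any specific framework:

```python
from statistics import mean, pstdev

def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Group-relative advantages: z-score each rollout's reward within its group.

    `rewards` holds one scalar reward per sampled rollout for the same query
    (e.g. exact-match on the final answer). `eps` guards against a zero-variance
    group where every rollout earned the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four rollouts for one query; two found the right answer (reward 1.0).
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
# Correct rollouts get a positive advantage, incorrect ones negative,
# and the advantages sum to zero within the group.
```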
We welcome contributions to this repository! If you have any suggestions or feedback, please feel free to open an issue or submit a pull request.
If you find this repository useful, please consider citing it as follows:
@misc{awesome-agentic-search,
  author = {Hongjin Qian and Zheng Liu},
  title = {Awesome Agentic Search},
  year = {2025},
  publisher = {GitHub},
}