R-Search is a novel reinforcement learning framework for reasoning–search integration. It enables LLMs to autonomously perform multi-step reasoning with deep search interaction, and to learn optimal reasoning–search trajectories via multi-reward signals, substantially improving performance on complex logic- and knowledge-intensive tasks.
We release our trained R-Search models and datasets on Hugging Face.
conda create -n Rsearch python=3.10
conda activate Rsearch
pip install torch==2.4.0
pip install -e .
If you wish to use a local retriever, please additionally run:
conda create -n retrieval python=3.10
conda activate retrieval
conda install -c pytorch -c nvidia faiss-gpu=1.8.0
pip install -r requirements_retri.txt
You can download our standardized datasets (including corpus, training, and evaluation sets) by running:
bash scripts/download_data.sh
The data will be saved in the data/ directory.
If you only wish to run training, you only need to download data/corpus/2wikimultihopqa/train.json.
For the nq, popqa, triviaqa, and bamboogle corpora, we follow FlashRAG and use the Wiki-2018 Corpus. Due to its large size, please download it separately from FlashRAG.
In our experiments, we use a local retriever as the default search engine, employing e5-base-v2 as the dense retriever. For training, only the 2wikimultihopqa training set and its retrieval service are required.
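As a rough illustration (not the repo's actual code), dense retrieval with e5-style embeddings reduces to an inner-product search over L2-normalized vectors; the sketch below uses random NumPy arrays as stand-ins for real e5-base-v2 embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for e5-base-v2 passage embeddings (the real model outputs 768-dim vectors).
passage_emb = rng.standard_normal((100, 8)).astype(np.float32)
passage_emb /= np.linalg.norm(passage_emb, axis=1, keepdims=True)

def dense_retrieve(query_emb: np.ndarray, topk: int = 3) -> np.ndarray:
    """Rank passages by inner product of L2-normalized embeddings."""
    q = query_emb / np.linalg.norm(query_emb)
    scores = passage_emb @ q
    return np.argsort(-scores)[:topk]

top_ids = dense_retrieve(rng.standard_normal(8).astype(np.float32))
```

In the actual pipeline, the index-building script presumably precomputes passage embeddings into a FAISS index so this search runs at corpus scale.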
To set up the Index and Retrieval service, follow these steps:
First, activate the retrieval environment:
conda activate retrieval
For a local dense retriever:
bash scripts/build_index.sh --corpus 2wikimultihopqa --retriever_name e5
For a local sparse retriever (BM25):
bash scripts/build_index.sh --corpus 2wikimultihopqa --retriever_name bm25
You can set --corpus to: 2wikimultihopqa, hotpotqa, musique, or wiki-18.
- For Dense Retriever:
bash scripts/retrieval_launch.sh --corpus 2wikimultihopqa --port 8000
- For Sparse Retriever:
bash scripts/retrieval_launch_bm25.sh --corpus wiki-18 --port 8001
Your LLM can access the search engine via the HTTP API, e.g., http://127.0.0.1:8000/retrieve_2wikimultihopqa.
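The exact request schema is defined by the retrieval server code; as a hedged sketch (the field names `queries` and `topk` are assumptions — check the server source before use), a client call might look like:

```python
import json

# Hypothetical request body; verify the actual field names against the
# retrieval server implementation in this repo.
def build_retrieve_request(queries, topk=3):
    return json.dumps({"queries": queries, "topk": topk})

body = build_retrieve_request(["Who directed Inception?"])

# Sending it (requires the server launched via scripts/retrieval_launch.sh):
#   import requests
#   resp = requests.post("http://127.0.0.1:8000/retrieve_2wikimultihopqa",
#                        data=body, headers={"Content-Type": "application/json"})
```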
To use an online search engine (Google), run:
bash retrieval_launch_google.sh
Evidence evaluation during training uses Llama-3.2-3B-Instruct as the verifier, providing one of the reward signals.
To launch the evidence server, execute:
conda activate Rsearch
bash scripts/evidence_server.sh
Before training, please ensure the retrieval and evidence servers are running.
- Build Training Data:
conda activate Rsearch
bash scripts/pre_train_data.sh
- Run RL Training (example with Qwen2.5-7B):
bash scripts/train_grpo_7b.sh
bash scripts/train_ppo_7b.sh
cd src/eval
conda activate Rsearch
CUDA_VISIBLE_DEVICES=2 python main.py --method R-Search --model R-Search-3b-grpo --dataset nq
- Configure URLs, model checkpoints, and prompts in src/eval/config.yaml.
- For nq, popqa, triviaqa, and bamboogle: ensure the retrieval server is running on the wiki-18 corpus.
- For 2wikimultihopqa, hotpotqa, and musique: ensure the retrieval server is running on the corresponding corpus.
Evaluation results and output files will be saved in src/eval/output/.
The concept of R-Search is inspired by DeepSeek-R1. Its implementation builds upon veRL and Search-R1.
We sincerely appreciate these teams for their outstanding contributions to open-source research and development.