CtrlA logo

CtrlA: Adaptive RAG via Inherent Control

📘 Zhihu Blog • 📚 Social Media • 📝 arXiv Paper

The official implementation of CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control.

CtrlA framework overview

CtrlA is an inherent control-based adaptive RAG framework that enhances retrieval-augmented generation for LLMs by balancing their internal and external knowledge. It characterizes the LLM's internal states and intervenes in generation from two perspectives: honesty steering and confidence monitoring, both realized via simple yet effective feature direction representations.
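
As a rough sketch of the confidence-monitoring side (not the exact CtrlA implementation; the function names and threshold below are illustrative assumptions), a hidden state can be projected onto a learned feature direction, with a low score triggering retrieval:

import numpy as np

def confidence_score(hidden_state: np.ndarray, direction: np.ndarray) -> float:
    # Project the hidden state onto the unit-norm confidence direction.
    direction = direction / np.linalg.norm(direction)
    return float(hidden_state @ direction)

def should_retrieve(hidden_state: np.ndarray, direction: np.ndarray, threshold: float = 0.0) -> bool:
    # Hypothetical rule: fall back to external retrieval when confidence is low.
    return confidence_score(hidden_state, direction) < threshold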

News

Oct 7, 2024 🎉 We have updated CtrlA on arXiv (version 2).

💉 Installation

Install dependencies by running the command below.

pip install -r requirements.txt

💉 Datasets and Model

The dataset used for training the Confidence and Honesty Probes, as well as for our evaluation, is available here. Please create an eval_data/ directory and place all the data files within it.

Please download the model file from mistralai/Mistral-7B-Instruct-v0.1 on Hugging Face and place it in the model/ directory.
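
For reference, a local copy placed under model/ can be loaded with the Hugging Face transformers API; the exact directory name below is an assumption, so match it to your layout and configs/run.json:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "model/Mistral-7B-Instruct-v0.1"  # assumed local path
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto")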

💉 Confidence and Honesty Features

The pre-trained features are stored in the trained_probe/ directory.

To extract the features, refer to the train_confidence_probe.ipynb notebook for the confidence feature, and the train_honesty_probe.ipynb notebook for the honesty feature.
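
The notebooks contain the full procedure; as a minimal sketch of the underlying idea (the inputs below, hidden states paired with binary honesty or confidence labels, are assumptions), a feature direction can be derived from a linear probe:

import numpy as np
from sklearn.linear_model import LogisticRegression

def train_probe_direction(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    # X: hidden states from a chosen layer; y: binary labels (assumed inputs).
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    direction = probe.coef_[0]
    return direction / np.linalg.norm(direction)  # unit-norm feature direction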

💉 Retriever Setup

All the code related to the retriever setup is in the code/retrievers directory. We provide two retrieval services as reported in our paper:

  1. BM25 Retrieval Service using ElasticSearch
  2. BGE Retrieval Service using FAISS

💉 Downloads

  1. Wikipedia 2018 Snippets: wget https://dl.fbaipublicfiles.com/dpr/wikipedia_split/psgs_w100.tsv.gz
  2. BGE Embedding Model Weights: https://huggingface.co/BAAI/bge-large-en-v1.5

💉 Retriever Dependencies

  • FAISS: https://github.com/facebookresearch/faiss or https://pypi.org/project/faiss/
  • SentenceTransformers: https://github.com/UKPLab/sentence-transformers
  • Flask
  • PyTorch
  • ElasticSearch
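
If you install these from PyPI, a command along the following lines may work (package names are best guesses; consult each project's documentation):

pip install faiss-cpu sentence-transformers flask torch elasticsearch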

💉 Quick Start to Setup BGE Retrieval Service

cd code/retrievers/bge_retrieval_service  # go to the target directory
python encode_wiki_bge.py  # encode snippets into embeddings
python bge_faiss.py  # set up bge-retrieval service

The sample code to call the bge-retrieval service:

python send_req_bge_wiki.py -q <query> -k <stop_k> --use_prefix

--use_prefix is optional; it prepends the prefix Represent this sentence for searching relevant passages: to each query, enabling asymmetric encoding of queries and passages.
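
For illustration, the asymmetric encoding performed by the service looks roughly like this (a sketch, not the service's actual code; the FAISS index lookup at the end is assumed):

from sentence_transformers import SentenceTransformer

PREFIX = "Represent this sentence for searching relevant passages: "
encoder = SentenceTransformer("BAAI/bge-large-en-v1.5")

def encode_query(query: str, use_prefix: bool = True):
    # Queries get the prefix; passages are encoded without it.
    text = PREFIX + query if use_prefix else query
    return encoder.encode([text], normalize_embeddings=True)

# scores, ids = index.search(encode_query("who wrote hamlet"), k)  # FAISS lookup (index assumed)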

💉 Quick Start to Setup ES (Elasticsearch) Retrieval Service (BM25)

cd code/retrievers/es_retrieval_service  # go to the target directory
python es_dictionary.py  # convert passages in tsv to desired dictionary format.
python es_service.py  # set up Elasticsearch Retrieval Service

The sample code to call the es-retrieval service:

python send_es_req.py -q <query> -k <stop_k>

After deploying the retrieval service, please complete the corresponding retrieval functions in code/retrieval.py.
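
A minimal sketch of such a function, assuming the Flask service listens on a local endpoint (the URL, port, and JSON schema below are assumptions; match them to your deployment):

import requests

def bge_retrieve(query: str, k: int = 5):
    # Hypothetical endpoint; adjust to how bge_faiss.py exposes the service.
    resp = requests.post("http://localhost:8000/search", json={"query": query, "k": k})
    resp.raise_for_status()
    return resp.json()  # expected: a list of retrieved passages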

💉 Evaluation

All the commands can be found in ./run.sh.

💉 TriviaQA

python run.py --config configs/run.json --model run_short_form --dataset triviaqa --task triviaqa --max_new_tokens 1024 --retrieve_method bge_serper --metric match --use_tvq

💉 PopQA

python run.py --config configs/run.json --model run_short_form --dataset popqa --task popqa --max_new_tokens 1024 --retrieve_method bge_serper --metric match --use_tvq --continue_gen_without_contents

💉 ASQA

python run.py --config configs/run.json --model run_long_form --dataset asqa --task asqa --max_new_tokens 130 --retrieve_method bge --use_tvq

ALCE/ASQA offers a thorough evaluation of long-form QA using various metrics. To run its evaluation, clone the ALCE repository and download the necessary data.

git clone https://github.com/princeton-nlp/ALCE.git
python3 -m venv alce_env  # create a virtual environment for ALCE
cd ALCE
bash download_data.sh

💉 Bio Generation

python run.py --config configs/run.json --model run_long_form --dataset fact --task fact --max_new_tokens 300 --retrieve_method bge_serper --use_tvq

Please follow the instructions in the FactScore official repository to set up your environment. Since the original repository is no longer maintained, consider using alternative sources like wj210's fork or armingh2000's FactScoreLite for evaluations. To proceed, use the command below:

python -m factscore.factscorer --data_path <output_file>  --model_name retrieval+ChatGPT --cache_dir <cache_dir> --openai_key <openai_key> --verbose

💉 FreshQA

python run.py --config configs/run.json --model run_long_form --dataset fresh --task fresh --max_new_tokens 1024 --retrieve_method serper --use_tvq

Please follow the instructions provided in the freshllms/freshqa repository, which includes complete data and codes of FreshLLMs, to conduct your evaluation.

💉 Citation

If you find this work helpful, please cite it as follows:

@misc{liu2024ctrla,
      title={CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control}, 
      author={Huanshuo Liu and Hao Zhang and Zhijiang Guo and Kuicai Dong and Xiangyang Li and Yi Quan Lee and Cong Zhang and Yong Liu},
      year={2024},
      eprint={2405.18727},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

💉 Contact

If you have questions, feel free to send an email to huanshuo.liu[at]u.nus.edu.
