Source code built by Yoonna Jang, Suhyune Son, Jeongwoo Lee, and Junyoung Son.
Official code for the paper:
Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations, accepted at EMNLP 2023.
Despite the striking advances in recent language generation performance, model-generated responses have suffered from the chronic problem of hallucinations that are either untrue or unfaithful to a given source. Especially in the task of knowledge grounded conversation, the models are required to generate informative responses, but hallucinated utterances lead to miscommunication. In particular, entity-level hallucination that causes critical misinformation and undesirable conversation is one of the major concerns. To address this issue, we propose a post-hoc refinement method called REM. It aims to enhance the quality and faithfulness of hallucinated utterances by refining them based on the source knowledge. If the generated utterance has a low source-faithfulness score with the given knowledge, REM mines the key entities in the knowledge and implicitly uses them for refining the utterances. We verify that our method reduces entity hallucination in the utterance. Also, we show the adaptability and efficacy of REM with extensive experiments and generative results.
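As a rough illustration of the post-hoc loop described above (not the paper's implementation: the faithfulness scorer, entity miner, and refiner below are toy placeholders standing in for the DAE-based scorer and the trained refiner model):

```python
def faithfulness_score(utterance: str, knowledge: str) -> float:
    # Placeholder for the source-faithfulness score used in the paper;
    # here, plain token overlap with the knowledge serves as a crude proxy.
    u, k = set(utterance.lower().split()), set(knowledge.lower().split())
    return len(u & k) / max(len(u), 1)

def mine_entities(knowledge: str) -> list:
    # Placeholder entity miner: capitalized tokens stand in for the
    # key entities REM mines from the source knowledge.
    return [t for t in knowledge.split() if t[:1].isupper()]

def rem_refine(utterance: str, knowledge: str, threshold: float = 0.5) -> str:
    # Utterances scoring at or above the threshold are kept as-is;
    # low-scoring ones would be rewritten by a refiner model (not shown)
    # conditioned on the mined entities.
    if faithfulness_score(utterance, knowledge) >= threshold:
        return utterance
    entities = mine_entities(knowledge)
    return f"[refine with entities: {', '.join(entities)}] {utterance}"
```

A faithful utterance passes through unchanged, while an unfaithful one is routed to the refinement step together with the mined entities.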
We trained the models with `python==3.8` and `torch==1.9.0` on a single RTX 8000 GPU.
We built our code on open-source libraries such as transformers, pytorch-lightning, and wandb. We also use the DAE and Distinct-N metrics, and we thank the authors for releasing their code.
1. Create a conda environment:
$ conda create -n REM python=3.8 -y
2. Install `pytorch==1.9.0` according to your CUDA version. (Please see the documentation.)
3. Install the requirements:
$ pip install -r requirements.txt
4. Download the DAE (`dae_w_syn_hallu`) model checkpoint and place it in the directory `REM/metrics/dae_factuality/model/`.
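A quick sanity check that the checkpoint landed in the expected directory can be run from the repository root (the path below is the one stated above; `root` is a hypothetical parameter for running the check from elsewhere):

```python
from pathlib import Path

def dae_checkpoint_present(root: str = ".") -> bool:
    # The DAE checkpoint directory the README tells you to create.
    ckpt = Path(root) / "REM/metrics/dae_factuality/model/dae_w_syn_hallu"
    return ckpt.is_dir()

if not dae_checkpoint_present():
    print("DAE checkpoint missing - download dae_w_syn_hallu first.")
```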
Since DAE relies on Stanford CoreNLP, the command below should be run in the stanford-corenlp folder (please refer to the documentation for help):
$ nohup java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer &
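Once the server is up, a small Python probe can confirm it is reachable. This assumes the server's default port (9000) and its `/ping` endpoint; adjust the port if you started the server with a `-port` option:

```python
import urllib.error
import urllib.request

def corenlp_alive(host: str = "localhost", port: int = 9000,
                  timeout: float = 3.0) -> bool:
    """Return True if a CoreNLP server answers on host:port."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/ping",
                                    timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False

print("CoreNLP server reachable:", corenlp_alive())
```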
📦REM
┣ 📂data
┃ ┣ 📜train.json
┃ ┗ 📜valid.json
┣ 📂metrics
┃ ┣ 📜distincN.py
┃ ┗ 📂dae_factuality
┃   ┗ 📂model
┃     ┗ 📂dae_w_syn_hallu
┣ 📂src
┣ 📜README.md
┗ 📜requirements.txt
Uncomment the command lines in the `train_model.sh` file to start training the model.
$ sh train_model.sh
Uncomment the command lines in the `eval_model.sh` file to evaluate the model on the test set.
$ sh eval_model.sh
Uncomment the command lines in the `inference.sh` file to generate utterances with the trained models.
$ sh inference.sh
To use our data or source code, please cite our paper:
@inproceedings{jang2023post,
  title={Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations},
  author={Jang, Yoonna and Son, Suhyune and Lee, Jeongwoo and Son, Junyoung and Hur, Yuna and Lim, Jungwoo and Moon, Hyeonseok and Yang, Kisu and Lim, Heui-Seok},
  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
  pages={4844--4861},
  year={2023}
}