Source code built by Yoonna Jang, Suhyune Son, Jeongwoo Lee, and Junyoung Son.
Official code for the paper:
Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations, accepted at EMNLP 2023.
Despite the striking advances in recent language generation performance, model-generated responses have suffered from the chronic problem of hallucinations that are either untrue or unfaithful to a given source. Especially in the task of knowledge grounded conversation, the models are required to generate informative responses, but hallucinated utterances lead to miscommunication. In particular, entity-level hallucination that causes critical misinformation and undesirable conversation is one of the major concerns. To address this issue, we propose a post-hoc refinement method called REM. It aims to enhance the quality and faithfulness of hallucinated utterances by refining them based on the source knowledge. If the generated utterance has a low source-faithfulness score with the given knowledge, REM mines the key entities in the knowledge and implicitly uses them for refining the utterances. We verify that our method reduces entity hallucination in the utterance. Also, we show the adaptability and efficacy of REM with extensive experiments and generative results.
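As a rough illustration of the post-hoc loop described above (not the paper's implementation: the faithfulness scorer, entity miner, and refiner below are toy placeholders standing in for the DAE-based scorer and the trained refiner model):

```python
def faithfulness_score(utterance: str, knowledge: str) -> float:
    # Placeholder for the source-faithfulness score used in the paper;
    # here, plain token overlap with the knowledge serves as a crude proxy.
    u, k = set(utterance.lower().split()), set(knowledge.lower().split())
    return len(u & k) / max(len(u), 1)

def mine_entities(knowledge: str) -> list:
    # Placeholder entity miner: capitalized tokens stand in for the
    # key entities REM mines from the source knowledge.
    return [t for t in knowledge.split() if t[:1].isupper()]

def rem_refine(utterance: str, knowledge: str, threshold: float = 0.5) -> str:
    # Utterances scoring at or above the threshold are kept as-is;
    # low-scoring ones would be rewritten by a refiner model (not shown)
    # conditioned on the mined entities.
    if faithfulness_score(utterance, knowledge) >= threshold:
        return utterance
    entities = mine_entities(knowledge)
    return f"[refine with entities: {', '.join(entities)}] {utterance}"
```

A faithful utterance passes through unchanged, while an unfaithful one is routed to the refinement step together with the mined entities.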
We trained the models with `python==3.8` and `torch==1.9.0` on a single RTX 8000 GPU.
We built our code on open-source libraries such as transformers, pytorch-lightning, and wandb. We also use the DAE and Distinct-N metrics, and we thank the authors for releasing their code.
1. Create a conda environment:
$ conda create -n REM python=3.8 -y
2. Install `pytorch==1.9.0` according to your CUDA version. (Please see the documentation.)
3. Install the requirements:
$ pip install -r requirements.txt
4. Download the DAE (`dae_w_syn_hallu`) model checkpoint and place it in the directory `REM/metrics/dae_factuality/model/`.
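A quick sanity check that the checkpoint landed in the expected directory can be run from the repository root (the path below is the one stated above; `root` is a hypothetical parameter for running the check from elsewhere):

```python
from pathlib import Path

def dae_checkpoint_present(root: str = ".") -> bool:
    # The DAE checkpoint directory the README tells you to create.
    ckpt = Path(root) / "REM/metrics/dae_factuality/model/dae_w_syn_hallu"
    return ckpt.is_dir()

if not dae_checkpoint_present():
    print("DAE checkpoint missing - download dae_w_syn_hallu first.")
```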
Since DAE relies on Stanford CoreNLP, the command below should be run in the stanford-corenlp folder (please refer to the documentation for help):
$ nohup java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer &
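Once the server is up, a small Python probe can confirm it is reachable. This assumes the server's default port (9000) and its `/ping` endpoint; adjust the port if you started the server with a `-port` option:

```python
import urllib.error
import urllib.request

def corenlp_alive(host: str = "localhost", port: int = 9000,
                  timeout: float = 3.0) -> bool:
    """Return True if a CoreNLP server answers on host:port."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/ping",
                                    timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False

print("CoreNLP server reachable:", corenlp_alive())
```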
📦REM
┣ 📂data
┃ ┣ 📜train.json
┃ ┗ 📜valid.json
┣ 📂metrics
┃ ┣ 📜distincN.py
┃ ┗ 📂dae_factuality
┃   ┗ 📂model
┃     ┗ 📂dae_w_syn_hallu
┣ 📂src
┣ 📜README.md
┗ 📜requirements.txt
Uncomment the command lines in the `train_model.sh` file to start training the model.
$ sh train_model.sh
Uncomment the command lines in the `eval_model.sh` file to evaluate the model on the test set.
$ sh eval_model.sh
Uncomment the command lines in the `inference.sh` file to generate utterances with the trained models.
$ sh inference.sh
To use our data or source code, please cite our paper:
@inproceedings{jang2023post,
  title={Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations},
  author={Jang, Yoonna and Son, Suhyune and Lee, Jeongwoo and Son, Junyoung and Hur, Yuna and Lim, Jungwoo and Moon, Hyeonseok and Yang, Kisu and Lim, Heui-Seok},
  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
  pages={4844--4861},
  year={2023}
}