This repository provides the data and code to improve LLM honesty, as reported in this paper:
Parametric Knowledge is Not All You Need: Toward Honest Large Language Models via Retrieval of Pretraining Data
Christopher Adrian Kusuma, Muhammad Reza Qorib, and Hwee Tou Ng.
Findings of the 64th Annual Meeting of the Association for Computational Linguistics (PDF)
Install the code dependencies:

```bash
pip install -r requirements.txt
```

We provide our dataset in the data/TIP-TriviaQA folder and on HuggingFace.
- Training: data/TIP-TriviaQA/train.json
- Validation: data/TIP-TriviaQA/val.json
- Test: data/TIP-TriviaQA/test.json
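Each split can be loaded with the standard json module; a minimal sketch, assuming each file is a single JSON array (print one record to see the actual schema):

```python
import json

# Assumes the split is one JSON array; if it is one record per line,
# parse each line separately instead.
with open("data/TIP-TriviaQA/train.json") as f:
    train = json.load(f)

print(len(train), "training examples")
print(train[0])  # inspect one record to see the available fields
```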
If you want to recreate the data or extend it to another corpus, you need to index Pythia's training data by following the steps below.
```bash
# download the deduplicated Pile index maps
git lfs clone https://huggingface.co/datasets/EleutherAI/pythia_deduped_pile_idxmaps

# optional: check for file corruption
python data/checksum_shards.py

# combine the 83 shards into a single memmap
python data/unshard_memmap.py --input_file ./pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00000-of-00082.bin --num_shards 83 --output_dir ./pythia_pile_idxmaps/
```
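To spot-check the merged index maps, the EleutherAI/pythia repository ships a reader in utils/mmap_dataset.py; a minimal sketch, assuming that helper is on your path and that each entry is one pre-tokenized sequence of token IDs:

```python
# Requires utils/mmap_dataset.py from https://github.com/EleutherAI/pythia
from utils.mmap_dataset import MMapIndexedDataset

# Each entry is assumed to be one pre-tokenized Pile sequence (an array of token IDs).
dataset = MMapIndexedDataset(
    "./pythia_pile_idxmaps/pile_0.87_deduped_text_document", skip_warmup=True
)
print(len(dataset))
print(dataset[0][:16])  # first 16 token IDs of the first sequence
```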
Next, install Elasticsearch from https://www.elastic.co/downloads/elasticsearch, start the server, and build the indices:

```bash
# create the token and vector indices
./scripts/run_index_token.sh
./scripts/run_index_vector.sh

# retrieve supporting documents for TriviaQA questions
./scripts/run_find_document_triviaqa.sh
```
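Once the indices are built, you can sanity-check them with the official Python client; a minimal sketch, where the index and field names are assumptions (see scripts/run_index_token.sh for the real ones):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical index/field names -- check the indexing scripts for the actual ones.
resp = es.search(
    index="pile_tokens",
    query={"match": {"text": "Who wrote the novel Moby-Dick?"}},
    size=5,
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_id"])
```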
Finally, generate the dataset:

```bash
cd data
python3 generate_triviaqa_dataset.py
```

To reproduce our scores, download our trained responder and answerability_classifier models here. If you prefer to train the models yourself, follow the training guidelines below.
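If you prefer scripting the download, the huggingface_hub client can fetch the checkpoints; a minimal sketch, where the repo IDs are placeholders (use the actual links above):

```python
from huggingface_hub import snapshot_download

# Placeholder repo IDs -- replace with the actual repositories linked above.
snapshot_download(repo_id="<org>/responder", local_dir="./responder/best-checkpoint")
snapshot_download(repo_id="<org>/answerability_classifier", local_dir="./answerability_classifier/best-checkpoint")
```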
We provide the training and inference code for RETAIN under the folders train and evaluation, respectively.
Train both the answerability classifier and the responder agent:
```bash
./scripts/run_train_ac.sh
./scripts/run_train_rag.sh
```

After training, you will obtain 10 checkpoints for each agent. Identify the checkpoint with the highest F1 score on the validation set (see logs/ac-sft.out and logs/sft-rag-triviaqa-42.out).
Then merge the responder's LoRA checkpoint into the base model so that it can be served with vLLM:

```bash
python3 train/merge.py --base_model EleutherAI/pythia-12b-deduped --peft_models [/path/to/responder/lora/model] --save_name [/path/to/merged/responder/model]
```

For example:

```bash
python3 train/merge.py --base_model EleutherAI/pythia-12b-deduped --peft_models ./responder/best-checkpoint --save_name ./responder/merged
```
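As a rough illustration of what such a merge does with the peft API (paths mirror the example above; this sketch is not the script itself):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-12b-deduped", torch_dtype=torch.float16
)

# Load the LoRA adapter on top of the base model, then fold it into the weights.
merged = PeftModel.from_pretrained(base, "./responder/best-checkpoint").merge_and_unload()
merged.save_pretrained("./responder/merged")

# vLLM also needs the tokenizer files alongside the merged weights.
AutoTokenizer.from_pretrained("EleutherAI/pythia-12b-deduped").save_pretrained("./responder/merged")
```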
To evaluate RETAIN on our dataset, first start a vLLM server to serve the responder agent:

```bash
# run vLLM in a separate terminal
CUDA_VISIBLE_DEVICES=0 vllm serve ./responder/merged --port 9000 --served-model-name EleutherAI/pythia-12b-deduped &
```
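Before launching the evaluation, you can verify the server is up with any OpenAI-compatible client; for example (the prompt here is illustrative):

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API; the api_key value is unused but required.
client = OpenAI(base_url="http://localhost:9000/v1", api_key="EMPTY")

resp = client.completions.create(
    model="EleutherAI/pythia-12b-deduped",  # must match --served-model-name
    prompt="Question: Who wrote Moby-Dick?\nAnswer:",
    max_tokens=20,
)
print(resp.choices[0].text)
```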
Then run the inference script on the dataset:

```bash
# run the evaluation script after vLLM starts successfully
./scripts/run_retain.sh ./answerability_classifier/best-checkpoint ./responder/best-checkpoint
```
To evaluate RETAIN on HoneSet, start the vLLM server in the same way:

```bash
# run vLLM in a separate terminal
CUDA_VISIBLE_DEVICES=0 vllm serve ./responder/merged --port 9000 --served-model-name EleutherAI/pythia-12b-deduped &
```
Then run the inference script on HoneSet:

```bash
# run the evaluation script after vLLM starts successfully
./scripts/run_honeset.sh ./answerability_classifier/best-checkpoint ./responder/best-checkpoint
```

We provide supporting documents from Pythia's training data for questions deemed unanswerable under the criteria of Cheng et al. (2024) here.
This repository is licensed under the GNU General Public License Version 3 (see LICENSE).