FictionalQA

Paper: A Fictional Q&A Dataset for Studying Memorization and Knowledge Acquisition
Datasets Collection: tomg-group-umd/FictionalQA

Updates:

(6/18/25) Initial dataset generation code release.
(6/5/25) Paper posted to arXiv.

About

FictionalQA is a dataset created to help researchers study the dual processes of fact memorization and verbatim sequence memorization. It consists of synthetically generated, webtext-like documents about fictional events and the facts they entail, along with question-answer pairs about the facts in those documents.
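
To make the distinction between the two kinds of memorization concrete, here is a minimal Python sketch. It is not the released schema or the paper's evaluation code: the class names, field names, helper functions, and the sample document are all hypothetical and exist only to illustrate how a document with attached question-answer pairs could support both a verbatim probe and a fact probe.

```python
from dataclasses import dataclass, field
from typing import List

# NOTE: every name below is a hypothetical illustration,
# not the actual FictionalQA schema or evaluation code.


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class FictionalDocument:
    text: str  # synthetic, webtext-like document about a fictional event
    qa_pairs: List[QAPair] = field(default_factory=list)


def _normalize(s: str) -> str:
    """Lowercase and collapse whitespace for a lenient comparison."""
    return " ".join(s.lower().split())


def verbatim_match(doc: FictionalDocument, prefix_len: int, continuation: str) -> bool:
    """Verbatim probe: does `continuation` exactly reproduce the document
    text that follows the first `prefix_len` characters?"""
    if not continuation:
        return False
    target = doc.text[prefix_len:prefix_len + len(continuation)]
    return continuation == target


def fact_match(qa: QAPair, model_answer: str) -> bool:
    """Fact probe: does the model's answer contain the reference answer,
    regardless of surrounding wording?"""
    return _normalize(qa.answer) in _normalize(model_answer)


# Made-up example document, for illustration only.
doc = FictionalDocument(
    text="The Lumen Festival began in the town of Aster in 2041.",
    qa_pairs=[QAPair("Where did the Lumen Festival begin?", "Aster")],
)
```

In this sketch a model could pass the fact probe while failing the verbatim probe (it knows the answer but does not reproduce the text), or the reverse, which is the separation the dataset is meant to let researchers study.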

Work in progress 👷🛠️

This repository contains a refactored version of the dataset construction code used to produce the dataset for the paper.

We are working on demo and usage instructions, so check back soon!

Repository contents:

batch_prompts
batch_results
ds_readmes
etl_nbs_and_scripts
intermediate_results
lm_eval/tasks/fictional_qa
paper_figures
pipeline
.gitignore
LICENSE
README.md
requirements.txt

Repository: jwkirchenbauer/fictionalqa
