Stereotype Evaluation in LLMs

This repository contains a few scripts for evaluating stereotypes and social bias in language models.

Running scripts requires pip install transformers==4.51.3. OPT implementation is broken in newer 4.52.x versions.

StereoSet

StereoSet is a benchmark designed to evaluate stereotypical bias in language models across categories like gender, race, and profession using intrasentence format. stereoset.py is based on bias-bench implementation.

Models supported include:

causal decoder LMs (OPT, LLaMA, Mistral + any model that can be loaded using transformers.AutoModelForCausalLM)
causal LMs quantized with GPTQModel: https://github.com/ModelCloud/GPTQModel
bert-like encoder models

Usage

python stereoset.py --model_name_or_path "facebook/opt-125m" --persistent_dir "./"

Arguments

Argument	Type	Default	Description
`--persistent_dir`	str	`./`	Directory where persistent data (input/output) will be stored.
`--file_name`	str	`test.json`	Input file name for evaluation data.
`--model_name_or_path`	str	`bert-base-uncased`	HuggingFace model name or path to a pretrained checkpoint.
`--batch_size`	int	`1`	Batch size used during evaluation.
`--seed`	int	`None`	Random seed used for reproducibility and experiment ID.
`--cache_dir`	str	`None`	Directory for cached model files.

SoFA

sofa.py script runs the SoFA (Social Fairness) benchmark to evaluate social bias in language models. Dataset: copenlu/sofa

Usage

python sofa.py --model_name "facebook/opt-125m"

Arguments

Argument	Type	Default	Description
`--model_name`	str	`"gpt2"`	HF model name or path.
`--probe_file`	str	`"data/sofa/SBIC-Pro.csv"`	Path to the CSV file containing probe sentences.
`--identity_file`	str	`"data/sofa/identities_by_category.json"`	JSON file with identity groups used in bias evaluation.
`--batch_size`	int	`512`	Batch size for computing perplexities.
`--max_length`	int	`32`	Maximum input sequence length.
`--gptqmodel`	flag	`False`	Use GPTQ quantized model if set.

HolisticBias

The holistic-bias.py script runs evaluation on HolisticBias v1.1. To run the script, first download the dataset (v1.1):

git clone https://github.com/facebookresearch/ResponsibleNLP.git
cd ResponsibleNLP
pip install .
pip install -r holistic_bias/requirements.txt
pip install numpy==1.26.4  # optional for compatibility
python ./holistic_bias/generate_sentences.py "./data/holistic_bias/" --dataset-version "v1.1"

If run in colab :

import os
os.environ["PYTHONPATH"] = "/content/ResponsibleNLP/:" + os.environ.get("PYTHONPATH", "")

Usage example

python holistic_bias/run_bias_calculation.py \
    --model_name gpt2 \
    --dataset_path ./data/holistic_bias/v1.1/ \
    --output_dir ./results/HolisticBias-Output

Arguments

Argument	Type	Default	Description
`--model_name`	str	`"gpt2"`	HuggingFace model name or path.
`--dataset_path`	str	`"data/holistic_bias/"`	Path to the dataset
`--output_dir`	str	`"results/HolisticBias-Output"`	Output directory to save results.
`--batch_size`	int	`512`	Batch size for perplexity computation.
`--max_length`	int	`32`	Maximum length of input sequences.
`--gptqmodel`	flag	`False`	Use a GPTQ quantized model if set.
`--seed`	int	`42`	Random seed for reproducibility.

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
data		data
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
holistic-bias.py		holistic-bias.py
sofa.py		sofa.py
stereoset.py		stereoset.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Stereotype Evaluation in LLMs

StereoSet

Usage

Arguments

SoFA

Usage

Arguments

HolisticBias

Usage example

Arguments

About

Uh oh!

Releases

Languages

License

upunaprosk/stereo-evaluation

Folders and files

Latest commit

History

Repository files navigation

Stereotype Evaluation in LLMs

StereoSet

Usage

Arguments

SoFA

Usage

Arguments

HolisticBias

Usage example

Arguments

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Languages