This repository contains a few scripts for evaluating stereotypes and social bias in language models.
Running scripts requires pip install transformers==4.51.3
.
OPT implementation is broken in newer 4.52.x versions.
StereoSet is a benchmark designed to evaluate stereotypical bias in language models across categories like gender, race, and profession using intrasentence format.
stereoset.py
is based on bias-bench implementation.
Models supported include:
- causal decoder LMs (OPT, LLaMA, Mistral + any model that can be loaded using
transformers.AutoModelForCausalLM
) - causal LMs quantized with GPTQModel: https://github.com/ModelCloud/GPTQModel
- bert-like encoder models
python stereoset.py --model_name_or_path "facebook/opt-125m" --persistent_dir "./"
Argument | Type | Default | Description |
---|---|---|---|
--persistent_dir |
str | ./ |
Directory where persistent data (input/output) will be stored. |
--file_name |
str | test.json |
Input file name for evaluation data. |
--model_name_or_path |
str | bert-base-uncased |
HuggingFace model name or path to a pretrained checkpoint. |
--batch_size |
int | 1 |
Batch size used during evaluation. |
--seed |
int | None |
Random seed used for reproducibility and experiment ID. |
--cache_dir |
str | None |
Directory for cached model files. |
sofa.py
script runs the SoFA (Social Fairness) benchmark to evaluate social bias in language models.
Dataset: copenlu/sofa
python sofa.py --model_name "facebook/opt-125m"
Argument | Type | Default | Description |
---|---|---|---|
--model_name |
str | "gpt2" |
HF model name or path. |
--probe_file |
str | "data/sofa/SBIC-Pro.csv" |
Path to the CSV file containing probe sentences. |
--identity_file |
str | "data/sofa/identities_by_category.json" |
JSON file with identity groups used in bias evaluation. |
--batch_size |
int | 512 |
Batch size for computing perplexities. |
--max_length |
int | 32 |
Maximum input sequence length. |
--gptqmodel |
flag | False |
Use GPTQ quantized model if set. |
The holistic-bias.py
script runs evaluation on HolisticBias v1.1. To run the script, first download the dataset (v1.1):
git clone https://github.com/facebookresearch/ResponsibleNLP.git
cd ResponsibleNLP
pip install .
pip install -r holistic_bias/requirements.txt
pip install numpy==1.26.4 # optional for compatibility
python ./holistic_bias/generate_sentences.py "./data/holistic_bias/" --dataset-version "v1.1"
If run in colab :
import os
os.environ["PYTHONPATH"] = "/content/ResponsibleNLP/:" + os.environ.get("PYTHONPATH", "")
python holistic_bias/run_bias_calculation.py \
--model_name gpt2 \
--dataset_path ./data/holistic_bias/v1.1/ \
--output_dir ./results/HolisticBias-Output
Argument | Type | Default | Description |
---|---|---|---|
--model_name |
str | "gpt2" |
HuggingFace model name or path. |
--dataset_path |
str | "data/holistic_bias/" |
Path to the dataset |
--output_dir |
str | "results/HolisticBias-Output" |
Output directory to save results. |
--batch_size |
int | 512 |
Batch size for perplexity computation. |
--max_length |
int | 32 |
Maximum length of input sequences. |
--gptqmodel |
flag | False |
Use a GPTQ quantized model if set. |
--seed |
int | 42 |
Random seed for reproducibility. |