This repository was archived by the owner on Dec 16, 2022. It is now read-only.
Bias Mitigation and Direction Methods #5130
Merged
Changes from all commits
25 commits
79c6c33  added linear and hard debiasers
e23057c  worked on documentation
fcc3d34  committing changes before branch switch
7d00910  committing changes before switching branch
668a513  finished bias direction, linear and hard debiasers, need to write tests
91029ef  finished bias direction test
396b245  Commiting changes before switching branch
a8c22a1  finished hard and linear debiasers
ef6a062  finished OSCaR
2c873cb  bias mitigators tests and bias metrics remaining
d97a526  added bias mitigator tests
8460281  added bias mitigator tests
5a76922  finished tests for bias mitigation methods
85cb107  Merge remote-tracking branch 'origin/main' into arjuns/post-processin…
8e55f28  fixed gpu issues
b42b73a  fixed gpu issues
37d8e33  fixed gpu issues
31b1d2c  resolve issue with count_nonzero not being differentiable
a1f4f2a  merged main into post-processing-debiasing
Apr 21, 2021
36cebe3  added more references
88c083b  Merge branch 'main' of https://github.com/allenai/allennlp into arjun…
7269c1d  Merge branch 'main' into arjuns/post-processing-debiasing
schmmd  24ce58f  Merge branch 'main' into arjuns/post-processing-debiasing
AkshitaB  4495627  responded to Akshita's comments
1182b10  Merge branch 'arjuns/post-processing-debiasing' of https://github.com…
@@ -0,0 +1,301 @@

"""
A suite of differentiable methods to compute the bias direction
or concept subspace representing binary protected variables.
"""

import torch
import sklearn.svm
import numpy as np

from allennlp.common.checks import ConfigurationError


class BiasDirection:
    """
    Parent class for bias direction classes.

    # Parameters

    requires_grad : `bool`, optional (default=`False`)
        Option to enable gradient calculation.
    """

    def __init__(self, requires_grad: bool = False):
        self.requires_grad = requires_grad

    def _normalize_bias_direction(self, bias_direction: torch.Tensor):
        return bias_direction / torch.linalg.norm(bias_direction)


class PCABiasDirection(BiasDirection):
    """
    PCA-based bias direction. Computes a one-dimensional subspace that is the span
    of a specific concept (e.g. gender) using PCA. This subspace minimizes the sum of
    squared distances from all seed word embeddings.

    !!! Note
        It is uncommon to utilize more than one direction to represent a concept.

    Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar,
    V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021).
    [VERB: Visualizing and Interpreting Bias Mitigation Techniques for
    Word Representations](https://api.semanticscholar.org/CorpusID:233168618).
    ArXiv, abs/2104.02797.
    """

    def __call__(self, seed_embeddings: torch.Tensor):
        """

        # Parameters

        !!! Note
            In the examples below, we treat gender identity as binary, which does not accurately
            characterize gender in real life.

        seed_embeddings : `torch.Tensor`
            A tensor of size (batch_size, ..., dim) containing seed word embeddings related to
            a concept. For example, if the concept is gender, seed_embeddings could contain embeddings
            for words like "man", "king", "brother", "woman", "queen", "sister", etc.

        # Returns

        bias_direction : `torch.Tensor`
            A unit tensor of size (dim, ) representing the concept subspace.
        """

        # Some sanity checks
        if seed_embeddings.ndim < 2:
            raise ConfigurationError("seed_embeddings must have at least two dimensions.")

        with torch.set_grad_enabled(self.requires_grad):
            # pca_lowrank centers the embeddings by default.
            # There will be two dimensions when applying PCA to
            # definitionally-gendered words: 1) the gender direction,
            # 2) all other directions, with the gender direction being principal.
            _, _, V = torch.pca_lowrank(seed_embeddings, q=2)
            # get top principal component
            bias_direction = V[:, 0]
            return self._normalize_bias_direction(bias_direction)
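For orientation, here is a minimal usage sketch (not part of the diff). The import path assumes the module lands at `allennlp.fairness.bias_direction`, and the random tensor is a hypothetical stand-in for real seed word embeddings, so the resulting direction is only meaningful for checking shapes.

import torch

# Assumed import path (hypothetical; depends on where the PR places the module):
from allennlp.fairness.bias_direction import PCABiasDirection

# Hypothetical stand-ins for embeddings of gendered seed words such as
# ["man", "king", "brother", "woman", "queen", "sister"]; shape (6, 300).
seed_embeddings = torch.randn(6, 300)

pca_direction = PCABiasDirection(requires_grad=False)
bias_direction = pca_direction(seed_embeddings)

print(bias_direction.shape)               # torch.Size([300])
print(torch.linalg.norm(bias_direction))  # ~1.0, the direction is unit-normalized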

class PairedPCABiasDirection(BiasDirection):
    """
    Paired-PCA-based bias direction. Computes a one-dimensional subspace that is the span
    of a specific concept (e.g. gender) as the first principal component of the
    difference vectors between seed word embedding pairs.

    !!! Note
        It is uncommon to utilize more than one direction to represent a concept.

    Based on: T. Bolukbasi, K. W. Chang, J. Zou, V. Saligrama, and A. Kalai. [Man is to
    computer programmer as woman is to homemaker? Debiasing word embeddings]
    (https://api.semanticscholar.org/CorpusID:1704893).
    In ACM Transactions of Information Systems, 2016.

    Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar,
    V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021).
    [VERB: Visualizing and Interpreting Bias Mitigation Techniques for
    Word Representations](https://api.semanticscholar.org/CorpusID:233168618).
    ArXiv, abs/2104.02797.
    """

    def __call__(self, seed_embeddings1: torch.Tensor, seed_embeddings2: torch.Tensor):
        """

        # Parameters

        !!! Note
            In the examples below, we treat gender identity as binary, which does not accurately
            characterize gender in real life.

        seed_embeddings1 : `torch.Tensor`
            A tensor of size (batch_size, ..., dim) containing seed word
            embeddings related to a concept group. For example, if the concept is gender,
            seed_embeddings1 could contain embeddings for linguistically masculine words, e.g.
            "man", "king", "brother", etc.

        seed_embeddings2 : `torch.Tensor`
            A tensor of the same size as seed_embeddings1 containing seed word
            embeddings related to a different group for the same concept. For example,
            seed_embeddings2 could contain embeddings for linguistically feminine words, e.g.
            "woman", "queen", "sister", etc.

        !!! Note
            For Paired-PCA, the embeddings at the same positions in each of seed_embeddings1 and
            seed_embeddings2 are expected to form seed word pairs. For example, if the concept
            is gender, the embeddings for ("man", "woman"), ("king", "queen"), ("brother", "sister"), etc.
            should be at the same positions in seed_embeddings1 and seed_embeddings2.

        !!! Note
            All tensors are expected to be on the same device.

        # Returns

        bias_direction : `torch.Tensor`
            A unit tensor of size (dim, ) representing the concept subspace.
        """

        # Some sanity checks
        if seed_embeddings1.size() != seed_embeddings2.size():
            raise ConfigurationError("seed_embeddings1 and seed_embeddings2 must be the same size.")
        if seed_embeddings1.ndim < 2:
            raise ConfigurationError(
                "seed_embeddings1 and seed_embeddings2 must have at least two dimensions."
            )

        with torch.set_grad_enabled(self.requires_grad):
            paired_embeddings = seed_embeddings1 - seed_embeddings2
            _, _, V = torch.pca_lowrank(
                paired_embeddings,
                q=min(paired_embeddings.size(0), paired_embeddings.size(1)) - 1,
            )
            bias_direction = V[:, 0]
            return self._normalize_bias_direction(bias_direction)
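A similar hedged sketch for the paired variant; the two tensors below are hypothetical and must be pairwise aligned, as the note above requires. The import path is again an assumption.

import torch

from allennlp.fairness.bias_direction import PairedPCABiasDirection  # assumed module path

# Hypothetical paired seed embeddings: row i of each tensor forms a pair,
# e.g. ("man", "woman"), ("king", "queen"), ("brother", "sister"); shape (3, 300).
masculine_embeddings = torch.randn(3, 300)
feminine_embeddings = torch.randn(3, 300)

paired_pca = PairedPCABiasDirection()
bias_direction = paired_pca(masculine_embeddings, feminine_embeddings)
print(bias_direction.shape)  # torch.Size([300])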

class TwoMeansBiasDirection(BiasDirection):
    """
    Two-means bias direction. Computes a one-dimensional subspace that is the span
    of a specific concept (e.g. gender) as the normalized difference vector of the
    averages of seed word embedding sets.

    !!! Note
        It is uncommon to utilize more than one direction to represent a concept.

    Based on: Dev, S., & Phillips, J.M. (2019). [Attenuating Bias in Word Vectors]
    (https://api.semanticscholar.org/CorpusID:59158788). AISTATS.

    Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar,
    V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021).
    [VERB: Visualizing and Interpreting Bias Mitigation Techniques for
    Word Representations](https://api.semanticscholar.org/CorpusID:233168618).
    ArXiv, abs/2104.02797.
    """

    def __call__(self, seed_embeddings1: torch.Tensor, seed_embeddings2: torch.Tensor):
        """

        # Parameters

        !!! Note
            In the examples below, we treat gender identity as binary, which does not accurately
            characterize gender in real life.

        seed_embeddings1 : `torch.Tensor`
            A tensor of size (embeddings1_batch_size, ..., dim) containing seed word
            embeddings related to a specific concept group. For example, if the concept is gender,
            seed_embeddings1 could contain embeddings for linguistically masculine words, e.g.
            "man", "king", "brother", etc.
        seed_embeddings2 : `torch.Tensor`
            A tensor of size (embeddings2_batch_size, ..., dim) containing seed word
            embeddings related to a different group for the same concept. For example,
            seed_embeddings2 could contain embeddings for linguistically feminine words, e.g.
            "woman", "queen", "sister", etc.

        !!! Note
            seed_embeddings1 and seed_embeddings2 need NOT be the same size. Furthermore,
            the embeddings at the same positions in each of seed_embeddings1 and seed_embeddings2
            are NOT expected to form seed word pairs.

        !!! Note
            All tensors are expected to be on the same device.

        # Returns

        bias_direction : `torch.Tensor`
            A unit tensor of size (dim, ) representing the concept subspace.
        """
        # Some sanity checks
        if seed_embeddings1.ndim < 2 or seed_embeddings2.ndim < 2:
            raise ConfigurationError(
                "seed_embeddings1 and seed_embeddings2 must have at least two dimensions."
            )
        if seed_embeddings1.size(-1) != seed_embeddings2.size(-1):
            raise ConfigurationError("All seed embeddings must have same dimensionality.")

        with torch.set_grad_enabled(self.requires_grad):
            seed_embeddings1_mean = torch.mean(seed_embeddings1, dim=0)
            seed_embeddings2_mean = torch.mean(seed_embeddings2, dim=0)
            bias_direction = seed_embeddings1_mean - seed_embeddings2_mean
            return self._normalize_bias_direction(bias_direction)
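A hedged sketch for the two-means variant; unlike Paired-PCA, the two hypothetical seed sets below deliberately have different sizes and are not aligned, and the example also shows that gradients flow when `requires_grad=True`. The import path is assumed.

import torch

from allennlp.fairness.bias_direction import TwoMeansBiasDirection  # assumed module path

# Hypothetical unpaired seed sets of different sizes (4 vs. 6 seed words), 300-d embeddings.
seed_embeddings1 = torch.randn(4, 300, requires_grad=True)
seed_embeddings2 = torch.randn(6, 300, requires_grad=True)

# requires_grad=True keeps the direction differentiable with respect to the seed embeddings.
two_means = TwoMeansBiasDirection(requires_grad=True)
bias_direction = two_means(seed_embeddings1, seed_embeddings2)

bias_direction.sum().backward()     # gradients flow back to the seed embeddings
print(seed_embeddings1.grad.shape)  # torch.Size([4, 300])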

class ClassificationNormalBiasDirection(BiasDirection):
    """
    Classification normal bias direction. Computes a one-dimensional subspace that is the span
    of a specific concept (e.g. gender) as the direction perpendicular to the classification
    boundary of a linear support vector machine fit to classify seed word embedding sets.

    !!! Note
        It is uncommon to utilize more than one direction to represent a concept.

    Based on: Ravfogel, S., Elazar, Y., Gonen, H., Twiton, M., & Goldberg, Y. (2020).
    [Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection]
    (https://api.semanticscholar.org/CorpusID:215786522). ArXiv, abs/2004.07667.

    Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar,
    V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021).
    [VERB: Visualizing and Interpreting Bias Mitigation Techniques for
    Word Representations](https://api.semanticscholar.org/CorpusID:233168618).
    ArXiv, abs/2104.02797.
    """

    def __init__(self):
        super().__init__()

    def __call__(self, seed_embeddings1: torch.Tensor, seed_embeddings2: torch.Tensor):
        """

        # Parameters

        !!! Note
            In the examples below, we treat gender identity as binary, which does not accurately
            characterize gender in real life.

        seed_embeddings1 : `torch.Tensor`
            A tensor of size (embeddings1_batch_size, ..., dim) containing seed word
            embeddings related to a specific concept group. For example, if the concept is gender,
            seed_embeddings1 could contain embeddings for linguistically masculine words, e.g.
            "man", "king", "brother", etc.
        seed_embeddings2 : `torch.Tensor`
            A tensor of size (embeddings2_batch_size, ..., dim) containing seed word
            embeddings related to a different group for the same concept. For example,
            seed_embeddings2 could contain embeddings for linguistically feminine words, e.g.
            "woman", "queen", "sister", etc.

        !!! Note
            seed_embeddings1 and seed_embeddings2 need NOT be the same size. Furthermore,
            the embeddings at the same positions in each of seed_embeddings1 and seed_embeddings2
            are NOT expected to form seed word pairs.

        !!! Note
            All tensors are expected to be on the same device.

        !!! Note
            This bias direction method is NOT differentiable.

[Inline review thread on the note above]
Reviewer: If we intend to allow users to specify bias direction (and mitigator) methods in config, perhaps we should make "is_differentiable" a field, so that the list of methods which can be used can be obtained programmatically?
Author: Yes, this is part of the bias mitigators and direction wrappers PR - this PR is just the functional API.

        # Returns

        bias_direction : `torch.Tensor`
            A unit tensor of size (dim, ) representing the concept subspace.
        """

        # Some sanity checks
        if seed_embeddings1.ndim < 2 or seed_embeddings2.ndim < 2:
            raise ConfigurationError(
                "seed_embeddings1 and seed_embeddings2 must have at least two dimensions."
            )
        if seed_embeddings1.size(-1) != seed_embeddings2.size(-1):
            raise ConfigurationError("All seed embeddings must have same dimensionality.")

        device = seed_embeddings1.device
        seed_embeddings1 = seed_embeddings1.flatten(end_dim=-2).detach().cpu().numpy()
        seed_embeddings2 = seed_embeddings2.flatten(end_dim=-2).detach().cpu().numpy()

        X = np.vstack([seed_embeddings1, seed_embeddings2])
        Y = np.concatenate([[0] * seed_embeddings1.shape[0], [1] * seed_embeddings2.shape[0]])

        classifier = sklearn.svm.SVC(kernel="linear").fit(X, Y)
        bias_direction = torch.Tensor(classifier.coef_[0]).to(device)

        return self._normalize_bias_direction(bias_direction)
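A final hedged sketch for the SVM-based direction; as noted above, it is not differentiable, since the embeddings are detached and handed to scikit-learn. The import path and the random seed sets are assumptions for illustration only.

import torch

from allennlp.fairness.bias_direction import ClassificationNormalBiasDirection  # assumed module path

# Hypothetical unpaired seed sets; only the last (embedding) dimension has to match.
seed_embeddings1 = torch.randn(5, 300)
seed_embeddings2 = torch.randn(7, 300)

svm_direction = ClassificationNormalBiasDirection()
bias_direction = svm_direction(seed_embeddings1, seed_embeddings2)
print(bias_direction.shape)  # torch.Size([300]), on the same device as the inputs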
Review thread on `q=2` in `PCABiasDirection`:

Reviewer: Why do we set q=2?

Author: I followed the VERB implementation + paper. I think the intuition behind this is that there will be two dimensions when applying PCA to definitionally-gendered words: 1) the gender direction, 2) all other directions, with the gender direction being principal.

Author: Added a comment in the file itself.
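The intuition in the reply above can be sanity-checked with a small synthetic sketch (not from the PR): toy embeddings that share one strong common direction yield a dominant first principal component under `torch.pca_lowrank` with `q=2`.

import torch

torch.manual_seed(0)

# Toy 300-d "embeddings": one shared direction with varying coefficients, plus small noise.
shared_direction = torch.randn(300)
coefficients = torch.randn(6)
seed_embeddings = torch.outer(coefficients, shared_direction) + 0.1 * torch.randn(6, 300)

# pca_lowrank centers the data by default; S holds the top-q singular values.
_, S, V = torch.pca_lowrank(seed_embeddings, q=2)

print(S[0] / S[1])  # much greater than 1: the first component dominates
cosine = torch.nn.functional.cosine_similarity(V[:, 0], shared_direction, dim=0)
print(cosine.abs())  # close to 1: the top component recovers the shared direction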