Stars
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888
Official repo for the paper Recovering Private Text in Federated Learning of Language Models (NeurIPS 2022)
GradAttack is a Python library for easy evaluation of privacy risks in public gradients in Federated Learning, as well as corresponding mitigation strategies.