Highlights
- Pro
Lists (1)
Sort Name ascending (A-Z)
Starred repositories
The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.
A list of awesome papers and resources of the intersection of Large Language Models and Evolutionary Computation.
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
A compilation of the best multi-agent papers
The official implementation of our ICLR 2025 paper "One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs".
Rewrite to Jailbreak: Discover Learnable and Transferable Implicit Harmfulness Instruction (ACL2025)
A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
An overview of LLMs for cybersecurity.
A python utility package for creating, modifying, and reading LDraw files and data structures.
Set of tools to assess and improve LLM security.
Papers and resources related to the security and privacy of LLMs 🤖
The dataset comprises resumes collected from various sources, including Google Images, Bing Images, and the website LiveCareer. Each resume entry consists of two columns: "Category" and "Text".
Benchmark for structure transformation attacks
A fast + lightweight implementation of the GCG algorithm in PyTorch
Every practical and proposed defense against prompt injection.
NVR with realtime local object detection for IP cameras
Adding guardrails to large language models.
Official repo for GPTFUZZER : Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
A Guardrials Hub validator used to detect if prompt injection is present
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Universal and Transferable Attacks on Aligned Language Models
an awesome list of honeypot resources