TrustAIRLab repositories · GitHub

    Repositories list

    • SaferVLM

      Public
      0000 Updated Jul 15, 2025
    • T-GPS

      Public
      Python
      0200 Updated Jul 13, 2025
    • Python
      0100 Updated Jun 24, 2025
    • Python
      67600 Updated Jun 8, 2025
    • [ACL2025] Official repository for "Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media"
      Python
      1500 Updated May 29, 2025
    • This is the public code repository for the paper 'Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models'
      Python
      1900 Updated May 21, 2025
    • GPTracker

      Public
      [S&P'25] GPTracker: A Large-Scale Measurement of Misused GPTs
      Python
      0600 Updated Apr 2, 2025
    • HateBench

      Public
      [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
      2700 Updated Mar 1, 2025
    • [Usenix Security 2025] Synthetic Artifact Auditing: Tracing LLM-Generated Synthetic Data Usage in Downstream Applications
      Python
      0300 Updated Jan 29, 2025
    • [Usenix Security 2025] On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
      Python
      0210 Updated Jan 29, 2025
    • 0000 Updated Jan 28, 2025
    • ModSCAN

      Public
      An official public repository of the paper "ModSCAN: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities" (https://arxiv.org/abs/2410.06967).
      Python
      1200 Updated Jan 8, 2025
    • ICL-MIA

      Public
      Python
      0410 Updated Dec 19, 2024
    • Python
      0800 Updated Dec 18, 2024
    • JavaScript
      0810 Updated Oct 30, 2024
    • ZeroFake

      Public
      Python
      11110 Updated Oct 30, 2024
    • homepage

      Public
      JavaScript
      0000 Updated Oct 14, 2024
    • 0000 Updated Aug 28, 2024
    • ML-Doctor

      Public
      Code for ML Doctor
      Python
      0600 Updated Aug 14, 2024
    • Code for Voice Jailbreak Attacks Against GPT-4o.
      Python
      13110 Updated May 31, 2024
    • easy-bib

      Public
      TeX
      1501 Updated Mar 9, 2024
    • .github

      Public
      0000 Updated Feb 28, 2024
    • Python
      0500 Updated Feb 23, 2024
    • A dataset of 6,387 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 666 jailbreak prompts).
      01200 Updated Feb 21, 2024
    • Python
      0200 Updated Feb 21, 2024
    • MGTBench

      Public
      Python
      0600 Updated Feb 21, 2024