thu-coai repositories · GitHub

    Repositories list

    • ShieldVLM

      Public
      Python
      0200 Updated Jul 6, 2025
    • [ACL 2025] LongSafety: Evaluating Long-Context Safety of Large Language Models
      Python
      MIT License
      01200 Updated Jun 18, 2025
    • SPaR

      Public
      Python
      Apache License 2.0
      34700 Updated Jun 11, 2025
    • Python
      MIT License
      11300 Updated May 27, 2025
    • [ACL 2025] Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
      Python
      01000 Updated May 23, 2025
    • HPSS

      Public
      HPSS: Heuristic Prompting Strategy Search for LLM Evaluators (ACL 2025 Findings)
      Python
      0200 Updated May 23, 2025
    • Python
      MIT License
      52210 Updated May 22, 2025
    • BARREL

      Public
      Python
      MIT License
      11500 Updated May 21, 2025
    • [ACL'25] SocialEval: Evaluating Social Intelligence of Large Language Models
      MIT License
      0200 Updated May 17, 2025
    • Python
      MIT License
      13810 Updated May 15, 2025
    • AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
      Python
      MIT License
      1018800 Updated May 10, 2025
    • Crisp

      Public
      Crisp: Cognitive Restructuring of Negative Thoughts through Multi-turn Supportive Dialogues
      Python
      0800 Updated Apr 27, 2025
    • [AAAI'25] CharacterBench: Benchmarking Character Customization of Large Language Models
      Python
      01000 Updated Apr 25, 2025
    • VPO

      Public
      Python
      Apache License 2.0
      11010 Updated Mar 26, 2025
    • MAPS

      Public
      Official Implementation of ICLR25 paper "MAPS: Advancing Multi-modal Reasoning in Expert-level Physical Science"
      Python
      1400 Updated Mar 12, 2025
    • Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
      Python
      MIT License
      118840 Updated Feb 20, 2025
    • [EMNLP'24] CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
      Python
      Apache License 2.0
      3646830 Updated Jan 7, 2025
    • MiniPLM

      Public
      [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models
      Python
      MIT License
      94940 Updated Nov 23, 2024
    • Python
      01710 Updated Nov 7, 2024
    • OpenMEVA

      Public
      Benchmark for evaluating open-ended generation
      Python
      75031 Updated Nov 6, 2024
    • CodePlan

      Public
      21510 Updated Oct 16, 2024
    • ShieldLM

      Public
      ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]
      Python
      MIT License
      920110 Updated Sep 29, 2024
    • PICL

      Public
      Code for ACL2023 paper: Pre-Training to Learn in Context
      Python
      MIT License
      410711 Updated Jul 26, 2024
    • PsyQA

      Public
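      A Chinese mental health support question-answering dataset with rich annotations of support strategies. It can be used to generate long counseling texts that make use of those strategies.
      1722000 Updated Jul 21, 2024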
    • [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
      Python
      12600 Updated Jul 9, 2024
    • Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks
      Python
      12930 Updated Jul 9, 2024
    • Python
      314460 Updated Jul 1, 2024
    • Official github repo for AutoDetect, an automated weakness detection framework for LLMs.
      Python
      MIT License
      14200 Updated Jun 25, 2024
    • BPO

      Public
      Python
      Apache License 2.0
      1532410 Updated Jun 24, 2024
    • Official github repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]
      Python
      MIT License
      1122851 Updated Jun 24, 2024