Stars
The official code for "An Engorgio Prompt Makes Large Language Model Babble on"
Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives
Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)
Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Wan: Open and Advanced Large-Scale Video Generative Models
OWASP Foundation Web Repository
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
[ICLR 2025] Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations"
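"Stones from other hills may serve to polish jade": Fudan's Baize Intelligence (复旦白泽智能) releases JADE-DB, a demo dataset targeting domestic open-source and foreign commercial large models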
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Discord server https://discord.gg/HrV52MgSC2 QQ channel https://pd.qq.com/s/1dwwmkgq4
A generative speech model for daily dialogue.
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Generative Models by Stability AI
High-Resolution Image Synthesis with Latent Diffusion Models
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis