Starred repositories
Stable Diffusion web UI
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Developer-first error tracking and performance monitoring
ChatGLM-6B: An Open Bilingual Dialogue Language Model
🚀 AI Voice Cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
A Deep Learning based project for colorizing and restoring old images (and video!)
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
ChatGLM2-6B: An Open Bilingual Chat LLM
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Augmented Traffic Control: A tool to simulate network conditions
Chinese and English multimodal conversational language model
Fine-tuning of the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models for specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models
Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, and detection in specific scenarios