Stars
nvidia-modelopt is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for do…
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
Run generative AI models on Sophgo BM1684X/BM1688 chips.
YOLOv12: Attention-Centric Real-Time Object Detectors
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS…
Robust Speech Recognition via Large-Scale Weak Supervision
Llama3-Tutorial (XTuner, LMDeploy, OpenCompass)
High-speed Large Language Model Serving for Local Deployment
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER = 0.86 on Vox1-O when trained only on Vox2).
Noise suppression using deep filtering.
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
A flexible framework powered by ComfyUI for generating personalized Nobel Prize images.
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
Retrieval and Retrieval-augmented LLMs
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Official inference repo for FLUX.1 models
[EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models
A PyTorch quantization backend for Optimum.
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.