Stars
✨✨VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
✨✨Latest Advances on Multimodal Large Language Models
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
A live stream development of RL tuning for LLM agents
Solve Visual Understanding with Reinforced VLMs
A fork to add multimodal model training to open-r1
✨First Open-Source R1-like Video-LLM [2025/02/18]
Qihoo360 / 360-LLaMA-Factory
Forked from hiyouga/LLaMA-Factory; adds Sequence Parallelism into LLaMA-Factory
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
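MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, targeting the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.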
🦜🔗 Build context-aware reasoning applications
Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"
LongQLoRA: Extend Context Length of LLMs Efficiently
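SuperCLUE: A Comprehensive Benchmark for General-Purpose Chinese Foundation Models
Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and commercially usable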
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A series of large language models developed by Baichuan Intelligent Technology
Free and Open Source, Distributed, RESTful Search Engine
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
[CVPR2023] The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.