Starred repositories
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Open-source implementation of grouped-query attention from the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints"
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Code release for ActionFormer (ECCV 2022)
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection - CVPR NAS 2023
Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation
A collection of ICLR papers and open-source projects from past years, covering ICLR 2021, ICLR 2022, ICLR 2023, ICLR 2024, and ICLR 2025.
Dataset of the paper "Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS"
PyTorch implementation of "A Novel Plug-in Module for Fine-Grained Visual Classification".
HOTA (and other) evaluation metrics for Multi-Object Tracking (MOT).
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
When do we not need larger vision models?
Dataset pruning for ImageNet and LAION-2B.
Everything about the SmolLM2 and SmolVLM family of models
PyTorch implementation of "Clothes-Changing Person Re-identification with RGB Modality Only" (CVPR 2022).
Official repository for "GaitGraph: Graph Convolutional Network for Skeleton-Based Gait Recognition" (ICIP'21)
Official code for the WACV 2023 paper "MEVID: Multi-view Extended Videos with Identities for Video Person Re-Identification"
A simple baseline for pedestrian attribute recognition in surveillance scenarios
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Awesome things towards foundation agents. Papers / Repos / Blogs / ...