Stars
The power of Claude Code + [Gemini / OpenAI / Grok / OpenRouter / Ollama / Custom Model / All Of The Above] working as one.
This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our emails: linyijie.gm@gmail.com yangmouxing@gmail.com qinyang.gm…
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
A GUI client for Windows, Linux, and macOS, supporting Xray, sing-box, and others
Official Implementation for Flare-Aware Cross-modal Enhancement for Multi-spectral Vehicle ReID
Code for "ICPL-ReID: Identity-Conditional Prompt Learning for Multi-Spectral Object Re-Identification"
PyTorch Implementation of LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification
[WACV 2025] Python implementation of Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
A curated list of papers on the applications of RWKV in computer vision.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Integrated development environment (IDE), an editor for Smart Scripts (SAI/smart_scripts) for TrinityCore based servers. Cmangos support work in progress. Featuring a 3D view built with OpenGL and …
[ICLR 2025 Spotlight] Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model [CVPR 2025]
Repository of Vision Transformer with Deformable Attention (CVPR 2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
[arXiv 2025] DiffV2IR: Visible-to-Infrared Diffusion Model via Vision-Language Understanding
Monocular Lane Detection Based on Deep Learning: A Survey
[CVPR 2025] Official repository for "From Poses to Identity: Training-Free Person Re-Identification via Feature Centralization"
[CVPR 2025] IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models [CVPR 2025]
Official repository for the ICLR 2024 paper "Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition".