Stars
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
MLNLP: Paper Picture Writing Code
Finetuning DINOv2 (https://github.com/facebookresearch/dinov2) on your own dataset
Python package for retrieving current and historical photos from Google Street View
Official code for CVPR 2022 paper "Rethinking Visual Geo-localization for Large-Scale Applications"
Cooperative Driving Dataset: a dataset for multi-agent driving scenarios
awesome-autonomous-driving
AnyLoc: Universal Visual Place Recognition (RA-L 2023)
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
The C++ Implementation of XFeat (Accelerated Features).
Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
A simple, easy-to-use, cross-platform and multi-language high-performance retrieval engine for similar vectors, similar words, and similar sentences. Stars & forks welcome. Build together!
A library for efficient similarity search and clustering of dense vectors.
PyTorch code and models for the DINOv2 self-supervised learning method.
[IEEE Sensors Journal (JSEN) ] SuperVINS: A Real-Time Visual-Inertial SLAM Framework for Challenging Imaging Conditions (integrated deep learning features)
[ECCV 2024] The official implementation of the paper "Cross-view Image Geo-localization with Panorama-BEV Co-Retrieval Network".
Segment Anything combined with CLIP
Official PyTorch implementation of FB-BEV & FB-OCC - Forward-backward view transformation for vision-centric autonomous driving perception
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
[IROS'24 Oral] A Fully Open-source and Compact Aerial Robot with Omnidirectional Visual Perception