Stars
YOLOv12: Attention-Centric Real-Time Object Detectors
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Wan: Open and Advanced Large-Scale Video Generative Models
Framework-agnostic sliced/tiled inference + interactive UI + error analysis plots
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
High-resolution models for human tasks.
Open-source toolbox for visual fashion analysis based on PyTorch
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Official inference repo for FLUX.1 models
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV 2024] Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
The official implementation of "Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation" (CVPR 2024).
A Collection of BM25 Algorithms in Python
State-of-the-Art Text Embeddings
Header-only C++/Python library for fast approximate nearest neighbors
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
Code and models for the ICML 2024 paper "NExT-GPT: Any-to-Any Multimodal Large Language Model"
Code for "Seesaw: Compensating for Nonlinear Reduction with Linear Computations for Private Inference" (ICML 2024)