Stars
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Lightweight coding agent that runs in your terminal
Official PyTorch Implementation of Opt-CWM: Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals.
Official repository of the paper: Masked Scene Modeling (CVPR 2025)
A simple training-free approach adapting DUSt3R for dynamic scenes.
Official PyTorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video” (ECCV 2024)
Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language
This is the implementation of our ECCV 2024 paper "Unsupervised Dense Prediction using Differentiable Normalized Cuts" by Yanbin Liu and Stephen Gould.
Official code for "DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut", NeurIPS 2024
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
Acceptance rates for the major AI conferences
Compute point cloud geometric features from python
Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs
MultiMAE: Multi-modal Multi-task Masked Autoencoders, ECCV 2022
PyTorch implementation of R-MAE https://arxiv.org/abs/2306.05411
(ICCV'23) Learning to Upsample by Learning to Sample
DiffSeg is an unsupervised zero-shot segmentation method using attention information from a stable-diffusion model. This repo implements the main DiffSeg algorithm and additionally includes an expe…
Official implementation of MOST: Multiple object localization with self-supervised transformers published at ICCV 2023
Curated list of awesome works on unsupervised object localization in 2D images.
Official repository for the ICCV2023 paper "Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV"
The official github repo for "Test-Time Training with Masked Autoencoders"
Python package to corrupt arbitrary images.
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…
merantix-momentum / stego-studies
Forked from mhamilton723/STEGO
Follow-up Studies on STEGO