Stars
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
Fine-tune Segment-Anything Model with Lightning Fabric.
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Official PyTorch Implementation of ICCV 2023 Oral: Cross-Ray Neural Radiance Fields for Novel-view Synthesis from Unconstrained Image Collections
Official code for VisProg (CVPR 2023 Best Paper!)
Perception toolkit for sim2real training and validation in Unity
A comprehensive list of Implicit Representations, NeRF and 3D Gaussian Splatting papers relating to SLAM/Robotics domain, including papers, videos, codes, and related websites
A curated list of awesome neural radiance fields papers
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editin…
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
[CVPR2023] NeRF-RPN: A general framework for object detection in NeRFs
EVA Series: Visual Representation Fantasies from BAAI
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
TensorFlow implementation of "Prompt-to-Prompt Image Editing with Cross Attention Control" for Stable Diffusion
tianrun-chen / SAM-Adapter-PyTorch
Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
[Image 2 Text Para] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
Official implementation of MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation (https://arxiv.org/abs/2205.09853)
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
This method uses Segment Anything and CLIP to ground and count any object that matches a custom text prompt, without requiring any point or box annotation.
General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"