Intel
- Munich
- www.matthias.pw
Stars
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Mobile manipulation research tools for roboticists
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
LAVIS - A One-stop Library for Language-Vision Intelligence
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
OpenBot leverages smartphones as brains for low-cost robots. We have designed a small electric vehicle that costs about $50 and serves as a robot body. Our software stack for Android smartphones su…
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Materials for the Hugging Face Diffusion Models Course
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
Code for Monocular Visual-Inertial Depth Estimation (ICRA 2023)
[ICCV 2023] VPD is a framework that applies the high-level and low-level knowledge of a pre-trained text-to-image diffusion model to downstream visual perception tasks.
Metric depth estimation from a single image
deforum / stable-diffusion
Forked from CompVis/stable-diffusion
Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
A latent text-to-image diffusion model
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
A treasure chest for visual classification and recognition powered by PaddlePaddle
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Instant neural graphics primitives: lightning fast NeRF and more
Barbershop: GAN-based Image Compositing using Segmentation Masks (SIGGRAPH Asia 2021)
FFCV: Fast Forward Computer Vision (and other ML workloads!)
A collection of papers on transformers for vision. Awesome Transformer with Computer Vision (CV)
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Language-Driven Semantic Segmentation