Stars
GenAI Processors is a lightweight Python library that enables efficient, parallel content processing.
This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…
Interactive visualizations of the geometric intuition behind diffusion models.
Monitor browser logs directly from Cursor and other MCP-compatible IDEs.
Official repository for "AM-RADIO: Reduce All Domains Into One"
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Best and simplest tool for website change detection, web page monitoring, and website change alerts. Perfect for tracking content changes, price drops, restock alerts, and website defacement monito…
Collection of scripts to build small-scale datasets for fine-tuning video generation models.
[ICCV 2025] OminiControl: Minimal and Universal Control for Diffusion Transformer
Fully open reproduction of DeepSeek-R1
Open-source benchmark suite for cloud microservices
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Chrome extension for renaming tabs that show paper PDFs from common providers
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
VueICL: Entity-Aware Video Understanding and Reasoning via In-Context Learning of Visual Annotations
BLOCK (AAAI 2019), with a multimodal fusion library for deep learning models
[NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
A high-throughput and memory-efficient inference and serving engine for LLMs
Integrating histology and spatial transcriptomics - NeurIPS 2024
This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Generation
Collection of AWESOME vision-language models for vision tasks