Stars
ComfyUI nodes to use segment-anything-2
SkyReels-A2: Compose anything in video diffusion transformers
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
A novel approach to Hunyuan image-to-video sampling
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).
Using Low-rank adaptation to quickly fine-tune diffusion models.
LoRA and DoRA from Scratch Implementations
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
CodeNav is an LLM agent that navigates and leverages previously unseen code repositories to solve user queries.
Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields, ICCV'23 (Oral, Best Paper Finalist)
Official code for "HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion"
Implicit Motion Function - unofficial recreation of the Microsoft work
Accept Bitcoin payments. Free, open-source & self-hosted Bitcoin payment processor.
An eventing platform that is distributed in time and space.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
pix2code: Generating Code from a Graphical User Interface Screenshot
Base Docker image to run Wine programs in a web browser via noVNC (HTML5 VNC viewer) + Xvfb + x11vnc
Make any web page a desktop application
A Docker image that provides a web VNC interface to access an Ubuntu LXDE/LXQt desktop environment.