bsdcfp / Starred · GitHub
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 10,278 782 Updated Dec 4, 2024

Radial Attention Official Implementation

Python 297 12 Updated Jul 6, 2025

Taming Stable Diffusion for Lip Sync!

Python 4,504 716 Updated Jun 20, 2025

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++ 413 56 Updated Jul 2, 2025

Task management for the Obsidian knowledge base.

TypeScript 2,985 283 Updated Jul 7, 2025

AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,655 381 Updated Apr 1, 2025

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 741 33 Updated May 17, 2025

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Jupyter Notebook 4,188 350 Updated Jan 13, 2025

Fast Automatic License Plate Recognition (ALPR) framework.

Python 154 41 Updated Jul 1, 2025

YOLOv5 license plate detection and recognition for Chinese license plates. Supports 12 types of Chinese plates, including double-layer plates.

Python 1,617 267 Updated Nov 25, 2024

Tile primitives for speedy kernels

Cuda 2,500 160 Updated Jul 7, 2025

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 1,533 92 Updated Jul 7, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 41,650 3,322 Updated Jul 7, 2025

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook 1,575 98 Updated Feb 16, 2024

Development repository for the Triton language and compiler

MLIR 16,067 2,098 Updated Jul 8, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,531 447 Updated Jul 8, 2025

VGDFR: Diffusion-based Video Generation with Dynamic Frame Rate

Python 11 Updated May 16, 2025

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

C++ 538 44 Updated Jul 2, 2025

Let's make video diffusion practical!

Python 14,931 1,353 Updated Jun 27, 2025

Efficient Triton Kernels for LLM Training

Python 5,318 366 Updated Jul 7, 2025

End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).

Python 365 13 Updated May 29, 2025

https://wavespeed.ai/ [WIP] The all in one inference optimization solution for ComfyUI, universal, flexible, and fast.

Python 1,089 49 Updated Mar 27, 2025

https://wavespeed.ai/ Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Python 1,274 79 Updated Mar 27, 2025

Helpful tools and examples for working with flex-attention

Python 865 54 Updated Jun 23, 2025

SD.Next: All-in-one WebUI for AI generative image and video creation

Python 6,440 488 Updated Jul 8, 2025

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Jupyter Notebook 549 101 Updated Jul 8, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 2,869 209 Updated Jul 8, 2025

[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Python 167 8 Updated Mar 1, 2025

Official PyTorch Implementation of "Optimal Stepsize for Diffusion Sampling".

Python 178 11 Updated Apr 13, 2025