
Releases: woct0rdho/SageAttention

v2.1.1-windows

25 Mar 02:48

First, note that if you just pip install sageattention, you get SageAttention 1, which uses only Triton (no CUDA kernels) and is easy to install.

This release is SageAttention 2, which has both Triton and CUDA kernels and can be faster than SageAttention 1 in many cases.

Both SageAttention 1 and 2 only support RTX 30xx and newer GPUs (sm >= 80). RTX 20xx and older are not supported.
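As a quick sanity check, you can read your GPU's compute capability with PyTorch and verify it meets the sm >= 80 requirement. This is a sketch; the sm_supported helper below is illustrative and not part of SageAttention:

```python
# Check whether a GPU meets SageAttention's sm >= 80 requirement.
# sm_supported is an illustrative helper, not part of SageAttention.

def sm_supported(major: int, minor: int) -> bool:
    # Compute capability (8, 6) means sm86; SageAttention needs sm >= 80.
    return major * 10 + minor >= 80

try:
    import torch  # requires a CUDA build of PyTorch to query a real GPU
    if torch.cuda.is_available():
        cap = torch.cuda.get_device_capability()
        print("this GPU is supported:", sm_supported(*cap))
except ImportError:
    pass  # PyTorch not installed; the examples below still illustrate the rule

print(sm_supported(8, 6))  # RTX 30xx (Ampere) -> True
print(sm_supported(7, 5))  # RTX 20xx (Turing) -> False
```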

Installation

  1. Know how to use pip to install packages in the correct Python environment. See https://github.com/woct0rdho/triton-windows
  2. Install triton-windows
  3. Install the wheel here
    • Choose the wheel for your PyTorch version. For example, 'torch2.7.0' in the filename
      • The torch minor version (2.6/2.7 ...) must be correct, but the patch version (2.7.0/2.7.1 ...) can be different
      • The CUDA version can be different, because SageAttention does not yet use any CUDA API that breaks across versions
    • For torch 2.8, the nightly wheels are unstable, so the SageAttention wheels here may not work with every nightly torch build; they're only tested against torch 2.8.0.dev20250415
    • Choose the wheel for your Python version. For example, 'cp312' in the filename means Python 3.12
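The matching rules above can be sketched as a small check: match the torch minor version and the Python tag, while ignoring the patch and CUDA versions. The helper and the wheel filename below are hypothetical, for illustration only; use the actual filename from this release page:

```python
# Illustrative check that a wheel filename matches your environment.
# wheel_matches and the example filename are hypothetical, not a real tool.
import re

def wheel_matches(wheel_name: str, torch_version: str, py: tuple) -> bool:
    # Match the torch minor version (e.g. 2.7) and the Python tag (e.g. cp312),
    # as described above; the torch patch and CUDA versions may differ.
    torch_tag = re.search(r"torch(\d+)\.(\d+)", wheel_name)
    py_tag = re.search(r"cp(\d)(\d+)", wheel_name)
    if not (torch_tag and py_tag):
        return False
    t_major, t_minor = (int(x) for x in torch_version.split(".")[:2])
    return (int(torch_tag.group(1)), int(torch_tag.group(2))) == (t_major, t_minor) \
        and (int(py_tag.group(1)), int(py_tag.group(2))) == py

# Hypothetical filename following the naming pattern described above:
name = "sageattention-2.1.1+cu126torch2.7.0-cp312-cp312-win_amd64.whl"
print(wheel_matches(name, "2.7.1", (3, 12)))  # patch differs -> still True
```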

Please help test it on all kinds of GPUs. If you see errors, please open an issue at https://github.com/woct0rdho/SageAttention/issues

We've recently simplified the installation considerably: there is no need to install Visual Studio or the CUDA Toolkit to use Triton and SageAttention (unless you want to step into the world of building from source).

To use SageAttention in ComfyUI, you just need to add --use-sage-attention when starting ComfyUI. The PatchSageAttentionKJ node is usually not needed and is not compatible with all workflows.
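For example, assuming a standard ComfyUI checkout where main.py is the entry point, the launch command looks like:

```shell
# Launch ComfyUI with SageAttention enabled
python main.py --use-sage-attention
```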

Dev notes
