make sure rotary positional embeddings are same type as queries
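A minimal sketch of what a fix like this looks like in PyTorch: the precomputed frequency table is cast to the query's dtype before rotation, so mixed-precision (fp16/bf16) queries are not silently upcast. The helper names here are illustrative, not the repository's exact code.

```python
import torch

def rotate_half(x):
    # split the last dimension into two halves and rotate: (x1, x2) -> (-x2, x1)
    x1, x2 = x.chunk(2, dim = -1)
    return torch.cat((-x2, x1), dim = -1)

def apply_rotary_emb(freqs, t):
    # cast the frequency table to the input's dtype so that fp16/bf16
    # queries stay in their dtype when rotated
    freqs = freqs.type(t.dtype)
    return (t * freqs.cos()) + (rotate_half(t) * freqs.sin())
```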
add token shifting
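Token shifting is a cheap way to mix each token with its predecessor's features. A sketch of the common formulation, in which half the feature dimensions are shifted one position along the sequence; the function name is an assumption:

```python
import torch
import torch.nn.functional as F

def shift_tokens(x):
    # x: (batch, seq, dim). Split features in half and shift one half a
    # single position along the sequence dimension.
    x, x_shift = x.chunk(2, dim = -1)
    x_shift = F.pad(x_shift, (0, 0, 1, -1), value = 0.)  # pad front, trim end
    return torch.cat((x, x_shift), dim = -1)
```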
apply rotary embeddings to values instead of using depthwise conv on queries
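A sketch of the change described, reusing the hypothetical apply_rotary_emb helper above: the same rotations are applied to the values, standing in for the depthwise convolution previously applied to the queries.

```python
def rotary_qkv(freqs, q, k, v):
    # rotate queries, keys, and values with the same position frequencies;
    # rotating v replaces the depthwise conv that was applied to queries
    q, k, v = map(lambda t: apply_rotary_emb(freqs, t), (q, k, v))
    return q, k, v
```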
fix rotary positional embedding
go all the way with rotary embeddings
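For context, a standard rotary embedding module producing the frequency table consumed by apply_rotary_emb above; this is the textbook RoPE formulation, not necessarily this repository's exact implementation:

```python
import torch
from torch import nn

class RotaryEmbedding(nn.Module):
    # dim is the per-head dimension to be rotated
    def __init__(self, dim):
        super().__init__()
        inv_freq = 1. / (10000 ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer('inv_freq', inv_freq)

    def forward(self, seq_len, device):
        t = torch.arange(seq_len, device = device).type_as(self.inv_freq)
        freqs = torch.einsum('i,j->ij', t, self.inv_freq)
        # duplicate so the table matches the full head dimension
        return torch.cat((freqs, freqs), dim = -1)
```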
allow for fine control over cancelling and registering kmeans update hooks
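One plausible shape for this kind of API, sketched with PyTorch's hook handles: registration returns the handles, and cancelling is simply removing them. The update_kmeans method name is an assumption:

```python
def register_kmeans_update_hooks(model, kmeans_attn_class):
    # attach a forward hook to every kmeans attention module and return
    # the handles so the caller can cancel the updates at any time
    handles = []
    for module in model.modules():
        if isinstance(module, kmeans_attn_class):
            handles.append(module.register_forward_hook(
                lambda m, _inputs, _output: m.update_kmeans()  # assumed method name
            ))
    return handles

def cancel_kmeans_update_hooks(handles):
    for handle in handles:
        handle.remove()
```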
expose KmeansAttention layer for personal use in other repositories
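Standalone usage might look like the following; the constructor arguments are illustrative and should be checked against the repository's README:

```python
import torch
from routing_transformer import KmeansAttention

attn = KmeansAttention(
    num_clusters = 64,   # number of kmeans clusters
    window_size = 64,    # size of each attention window
    num_heads = 8,
    head_dim = 64,
    causal = True
)

q = torch.randn(1, 8, 8192, 64)
k = torch.randn(1, 8, 8192, 64)
v = torch.randn(1, 8, 8192, 64)

out = attn(q, k, v)  # (1, 8, 8192, 64)
```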
fix examples
use embedding dimension, due to embedding factorization
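Embedding factorization (as popularized by ALBERT) embeds tokens into a small emb_dim and projects up to the model dimension, so any code touching the raw embedding must use emb_dim rather than the model dimension. A sketch with hypothetical names:

```python
from torch import nn

class FactorizedEmbedding(nn.Module):
    # tokens embed into emb_dim, then project up to the model dimension dim
    def __init__(self, num_tokens, dim, emb_dim):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, emb_dim)
        self.project = nn.Linear(emb_dim, dim) if emb_dim != dim else nn.Identity()

    def forward(self, x):
        return self.project(self.token_emb(x))
```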
allow one to look multiple blocks of window size into the past for local attention
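A sketch of the usual mechanism, modeled on the look-around pattern common in local attention implementations: keys and values are bucketed into windows, and each window is concatenated with a configurable number of preceding windows. The names and the backward parameter are assumptions:

```python
import torch
import torch.nn.functional as F

def look_around(x, backward = 1, pad_value = 0.):
    # x: (batch, num_windows, window_size, dim)
    # concatenate each window with `backward` preceding windows so queries
    # can attend several blocks of window size into the past
    b, w, n, d = x.shape
    padded = F.pad(x, (0, 0, 0, 0, backward, 0), value = pad_value)
    tensors = [padded[:, i:(i + w)] for i in range(backward + 1)]
    return torch.cat(tensors, dim = 2)  # (batch, num_windows, window_size * (backward + 1), dim)
```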