Insights: fla-org/flash-linear-attention
Overview
- 8 Merged pull requests
- 0 Open pull requests
- 3 Closed issues
- 0 New issues
8 Pull requests merged by 4 people
- [RWKV7] Strictly initialize rwkv7 according to RWKV-LM (#387, merged May 3, 2025)
- [Utils] Add fused pack/unpack fns (#386, merged May 2, 2025)
- [GDN] Initialize `A_log` and `dt_bias` in `_init_weights` (#380, merged May 1, 2025; an initialization sketch follows this list)
- [Tests] Enable tests with `causal_conv1d` on H100 CIs (#385, merged May 1, 2025)
- [PaTH] Add PaTH attention model and kernel (#384, merged May 1, 2025)
- [Misc.] Update triton-nightly.yml (#382, merged Apr 30, 2025)
- [RWKV7] Fix convert script (#383, merged Apr 30, 2025)
- [RWKV7] Rewrite docs to match the Triton code (#381, merged Apr 28, 2025)
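For context on the `A_log`/`dt_bias` item above: gated-delta-rule layers typically follow the Mamba2-style recipe of storing the decay as a log and the step size as an inverse softplus. Below is a minimal sketch of that recipe; `init_gdn_params` and the value ranges (`A ~ U[1, 16]`, `dt` log-uniform in `[1e-3, 1e-1]`) are illustrative assumptions, not the code merged in #380.

```python
import math
import torch
import torch.nn as nn

def init_gdn_params(num_heads: int,
                    dt_min: float = 1e-3,
                    dt_max: float = 1e-1,
                    dt_init_floor: float = 1e-4):
    # A is sampled uniformly in [1, 16] and stored in log space,
    # so the recovered decay exp(A_log) stays positive during training.
    A = torch.empty(num_heads).uniform_(1, 16)
    A_log = nn.Parameter(torch.log(A))

    # dt is sampled log-uniformly in [dt_min, dt_max]; dt_bias stores the
    # softplus inverse so that softplus(dt_bias) recovers dt in the forward.
    dt = torch.exp(
        torch.rand(num_heads) * (math.log(dt_max) - math.log(dt_min))
        + math.log(dt_min)
    ).clamp(min=dt_init_floor)
    dt_bias = nn.Parameter(dt + torch.log(-torch.expm1(-dt)))  # softplus^-1(dt)
    return A_log, dt_bias
```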
3 Issues closed by 2 people
- [Bug] FLA models fail to return logits when labels are not provided (#237, closed May 4, 2025; see the sketch below)
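The bug above is typical of models with a fused linear-cross-entropy path: during training the logits tensor is never materialized, so a forward that only builds logits on the labeled path returns nothing at inference time. A minimal sketch of the expected fallback, using a plain (unfused) loss and hypothetical names (`TinyLMHeadModel` is not the fla API):

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLMHeadModel(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self,
                hidden_states: torch.Tensor,
                labels: Optional[torch.Tensor] = None):
        # Logits must be materialized unconditionally when no labels are
        # given; the reported failure was models returning no logits here.
        logits = self.lm_head(hidden_states)
        if labels is None:
            return logits, None
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
        return logits, loss
```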
8 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [Bug] [RWKV7] `fuse_norm` not working (#370, commented on May 1, 2025 • 0 new comments)
- [RFC] Implement Triton version of token shift and ShortConv with varlen support (#319, commented on May 3, 2025 • 0 new comments)
- [Bug] `GLA` fails when upgrading to `triton-nightly` (#288, commented on May 4, 2025 • 0 new comments)
- DeepSeek initialization underperforms when the learning rate is increased for RoPE-based transformers (#266, commented on May 4, 2025 • 0 new comments)
- [RFC] Make all kernels compatible with Triton 3.2 on H100 GPUs (#252, commented on May 4, 2025 • 0 new comments)
- [Misc.] Add activations for non-CUDA backends (#174, commented on May 2, 2025 • 0 new comments)
- [Linear Attention] Update fused_recurrent.py for inference with `normalization=True` (#268, commented on May 2, 2025 • 0 new comments)
- Use `tl.exp2` for all gating operations (#361, commented on May 2, 2025 • 0 new comments; a Triton sketch follows this list)
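On the `tl.exp2` proposal above: since exp(x) = exp2(x · log2 e) and `tl.exp2` lowers to a fast hardware instruction on NVIDIA GPUs, the rewrite amounts to rescaling gate inputs by log2(e). Here is a minimal Triton sketch of an elementwise exponential gate written this way; it is illustrative only, not one of the fla kernels.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def gate_exp2_kernel(x_ptr, g_ptr, y_ptr, n, BLOCK: tl.constexpr):
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    x = tl.load(x_ptr + offs, mask=mask)
    g = tl.load(g_ptr + offs, mask=mask)
    # y = x * exp(g), computed as x * exp2(g * log2(e));
    # 1.4426950408889634 == log2(e).
    y = x * tl.exp2(g * 1.4426950408889634)
    tl.store(y_ptr + offs, y, mask=mask)

def gated(x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
    # x, g: contiguous CUDA tensors of the same shape.
    y = torch.empty_like(x)
    n = x.numel()
    gate_exp2_kernel[(triton.cdiv(n, 1024),)](x, g, y, n, BLOCK=1024)
    return y
```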