An attempt to implement the TTT (Test-Time Training) paper, but with the inner model as linear attention + an MLP instead of deriving linear-attention updates over gradients by hand, i.e. letting torch autograd do the work.
sentialx/shiTTT
About
Naive attempt at implementing the TTT paper by letting autograd do the heavy lifting
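The idea above can be sketched as follows: the inner model is a linear-attention layer followed by an MLP, and each test-time training step updates its weights with plain gradient descent, where `torch.autograd.grad` supplies the gradients instead of a hand-derived closed-form linear-attention update. This is a minimal illustrative sketch, not the repo's actual code; the names `InnerModel` and `ttt_step`, the reconstruction loss, and the hyperparameters are all assumptions.

```python
# Hedged sketch of a TTT-style inner loop in PyTorch.
# InnerModel / ttt_step are illustrative names, not from this repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InnerModel(nn.Module):
    """Inner model: linear (non-softmax) attention followed by a small MLP."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim, bias=False)
        self.k = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, dim, bias=False)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):  # x: (seq, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        phi = lambda t: F.elu(t) + 1          # positive feature map
        kv = torch.einsum("td,te->de", phi(k), v)  # (dim, dim) summary
        z = phi(q) @ kv / x.shape[0]          # linear attention output
        return self.mlp(z)

def ttt_step(model, x, target, lr=1e-2):
    """One test-time training step: autograd computes the inner-model
    gradients, then plain SGD updates the weights in place."""
    loss = F.mse_loss(model(x), target)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p -= lr * g
    return loss.item()

torch.manual_seed(0)
dim, seq = 16, 8
model = InnerModel(dim)
x = torch.randn(seq, dim)
# Self-supervised objective (assumed here): reconstruct the input.
losses = [ttt_step(model, x, x) for _ in range(10)]
print(losses[0], losses[-1])
```

The inner optimization is intentionally naive (full-batch SGD on a reconstruction loss); the point is only that autograd replaces the manual gradient algebra of the TTT paper.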