
🤖 smol-moe: A Lightweight Mixture of Experts Implementation in PyTorch

smol-moe is a minimal and modular implementation of a Sparse Mixture of Experts (MoE) transformer in PyTorch. It supports dynamic top-k expert routing (with auxiliary load balancing) and can be trained on TinyStories/TinyShakespeare-style datasets.
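The core idea is that each transformer block replaces the single feed-forward network with a pool of expert FFNs, and a learned router sends every token to only its top-k experts, keeping compute sparse. The sketch below illustrates one way such a layer can be written in PyTorch; the class and argument names (`SparseMoE`, `num_experts`, `top_k`) are illustrative assumptions, not the exact API of this repository, and the auxiliary loss shown is the Switch-Transformer-style load-balancing term.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    """Illustrative sparse MoE feed-forward block with top-k routing.

    Names and hyperparameters here are assumptions for demonstration,
    not the exact ones used in smol-moe.
    """

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        # Router produces one logit per expert for every token.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent position-wise feed-forward network.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(num_experts)
            ]
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len, d_model) -> flatten tokens for routing.
        batch, seq_len, d_model = x.shape
        tokens = x.reshape(-1, d_model)                      # (T, d_model)

        logits = self.router(tokens)                         # (T, num_experts)
        probs = F.softmax(logits, dim=-1)
        topk_probs, topk_idx = probs.topk(self.top_k, dim=-1)
        # Renormalize so the selected experts' weights sum to 1 per token.
        topk_probs = topk_probs / topk_probs.sum(dim=-1, keepdim=True)

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # Tokens for which expert e is among the top-k choices.
            token_ids, slot = (topk_idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            weight = topk_probs[token_ids, slot].unsqueeze(-1)
            out[token_ids] += weight * expert(tokens[token_ids])

        # Auxiliary load-balancing loss (Switch-Transformer style):
        # fraction of tokens routed to each expert times the mean router
        # probability, scaled by the number of experts. Encourages the
        # router to spread tokens evenly instead of collapsing onto a few experts.
        density = F.one_hot(topk_idx[:, 0], self.num_experts).float().mean(dim=0)
        router_prob = probs.mean(dim=0)
        aux_loss = self.num_experts * (density * router_prob).sum()

        return out.reshape(batch, seq_len, d_model), aux_loss
```

In training, the returned `aux_loss` would typically be multiplied by a small coefficient and added to the language-modeling loss so the router learns balanced expert utilization alongside next-token prediction.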
