Open
This GitHub issue tracks progress on supporting Mixture-of-Experts (MoE) models in MPK. We will initially focus on the following models:
- Qwen3-30B-A3B: https://huggingface.co/Qwen/Qwen3-30B-A3B
- Qwen3-235B-A22B: https://huggingface.co/Qwen/Qwen3-235B-A22B
For reference, FlashDMoE provides a persistent-kernel implementation that fuses MoE with collective communication: https://arxiv.org/pdf/2506.04667