Description
Hi, Mirage team!

Awesome work! I've already finished your Qwen3 demo and am now working on parallel inference. Could you tell me how to load a compiled Mirage kernel onto multiple GPUs? Is it compatible with torch.distributed? And what should I do (or what do you plan to do) if a model is too large to compile into a single mega-kernel because my CUDA memory is limited?
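For the memory-limited case, the workaround I'm imagining is splitting the model into contiguous layer segments that each fit the memory budget and compiling each segment separately instead of one mega-kernel. Here's a rough sketch of that partitioning idea (pure Python, nothing here is real Mirage API; `partition_layers` and the per-layer cost estimates are my own hypothetical names):

```python
def partition_layers(layer_costs, budget):
    """Greedily split per-layer memory costs (e.g. in GB) into contiguous
    segments whose total cost stays within `budget`. A single layer that
    alone exceeds the budget still gets its own segment."""
    segments, current, current_cost = [], [], 0
    for i, cost in enumerate(layer_costs):
        # Close the current segment if adding this layer would overflow it.
        if current and current_cost + cost > budget:
            segments.append(current)
            current, current_cost = [], 0
        current.append(i)
        current_cost += cost
    if current:
        segments.append(current)
    return segments

# Example: six layers with these estimated costs (GB), 8 GB budget.
print(partition_layers([3, 3, 3, 2, 5, 1], 8))  # -> [[0, 1], [2, 3], [4, 5]]
```

Each segment would then be compiled on its own, so peak compile-time memory is bounded by the largest segment rather than the whole model. Is something along these lines feasible with Mirage, or is there a supported way to do it?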