Feature Description
Running
python demo/qwen3/demo.py
with batch_size=1 works fine. However, a larger batch_size, such as 2, does not work. I tried to fix it myself but could not. Could I get some help?
After I adapted the Python-side code for multi-batch, the CUDA kernel code still does not run through. The MPK compile step reports that
assert(task_pos == (task_id & 0xffffffff));
failed in /mirage/src/kernel/runtime.cc:958
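For anyone reading the assertion: it compares task_pos with the low 32 bits of task_id only. Below is a minimal Python sketch of that check. The helper names and the sample values are hypothetical, and what the upper bits of task_id hold is an assumption made for illustration, not something taken from the Mirage source.

```python
MASK_32 = 0xFFFFFFFF  # same mask as in the assertion


def low_32_bits(task_id: int) -> int:
    """Return the low 32 bits of task_id (the value compared to task_pos)."""
    return task_id & MASK_32


def assertion_holds(task_id: int, task_pos: int) -> bool:
    """Mirror of the failing check: task_pos == (task_id & 0xffffffff)."""
    return task_pos == low_32_bits(task_id)


# Hypothetical values: some field (here 7) in the upper bits, 5 in the low bits.
packed = (7 << 32) | 5

print(assertion_holds(packed, 5))  # low 32 bits match task_pos
print(assertion_holds(packed, 6))  # mismatch, the case the assert rejects
```

If the check is read this way, a failure means the position at which a task was placed differs from the index encoded in the low bits of its id, which may help narrow down where the multi-batch changes diverge.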
Thanks for the project!