[Comment] - Default Persistent Kernel Configuration for Different GPU types · Issue #354 · mirage-project/mirage · GitHub
Open
@jiazhihao

Description

Proposed Plan

A100 GPU with 108 physical SMs

  • grid_dim = (108, 1, 1), block_dim = (128, 1, 1)
  • 96 workers (96 SMs), 48 schedulers (12 SMs)

H100 GPU with 132 physical SMs

  • grid_dim = (132, 1, 1), block_dim = (384, 1, 1)
  • 128 workers (128 SMs), 16 schedulers (4 SMs)
  • Each worker uses 128 threads as the producer (TMA) and 256 threads as the consumer (tensor cores)
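
The 384-thread block on H100 is the sum of the two roles: 128 + 256 threads, which is 4 producer warps and 8 consumer warps at 32 threads per warp. A minimal Python sketch of that arithmetic follows. The names are hypothetical, and placing the producer threads at the lowest thread indices is an assumption made for illustration; the plan does not say how threads are ordered within a block.

```python
WARP_SIZE = 32            # threads per warp on NVIDIA GPUs
PRODUCER_THREADS = 128    # TMA producer threads per worker (from the plan)
CONSUMER_THREADS = 256    # tensor-core consumer threads per worker (from the plan)
BLOCK_THREADS = PRODUCER_THREADS + CONSUMER_THREADS  # 384, matches block_dim.x


def thread_role(thread_idx: int) -> str:
    """Return the role of a thread within a worker block.

    ASSUMPTION: producer threads occupy indices [0, 128) and consumer
    threads occupy [128, 384). The plan only gives the counts.
    """
    if not 0 <= thread_idx < BLOCK_THREADS:
        raise ValueError(f"thread_idx {thread_idx} is outside the block")
    return "producer" if thread_idx < PRODUCER_THREADS else "consumer"


producer_warps = PRODUCER_THREADS // WARP_SIZE  # 4 warps
consumer_warps = CONSUMER_THREADS // WARP_SIZE  # 8 warps
```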

B200 GPU with 160 physical SMs

  • grid_dim = (4, 4, 10), block_dim = (384, 1, 1)
  • 144 workers (144 SMs), 64 schedulers (16 SMs)

H20 GPU with 78 physical SMs

  • grid_dim = (78, 1, 1), block_dim = (384, 1, 1)
  • Option 1: 64 workers (64 SMs), 56 schedulers (14 SMs)
  • Option 2: 72 workers (72 SMs), 24 schedulers (6 SMs)
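
The proposals above share two invariants that can be checked mechanically: the grid launches one block per physical SM, and worker SMs plus scheduler SMs add up to the SM count. The sketch below records the numbers from the plan in a hypothetical `PersistentKernelConfig` dataclass (an illustrative name, not a Mirage API) and validates them. It also shows that every proposal places 4 schedulers on each scheduler SM.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class PersistentKernelConfig:
    """Hypothetical container for one row of the proposed plan."""
    name: str
    physical_sms: int
    grid_dim: tuple
    block_dim: tuple
    num_workers: int      # one worker per SM in every proposal
    num_schedulers: int
    scheduler_sms: int


# Numbers copied from the plan; H20 has two candidate options.
CONFIGS = [
    PersistentKernelConfig("A100", 108, (108, 1, 1), (128, 1, 1), 96, 48, 12),
    PersistentKernelConfig("H100", 132, (132, 1, 1), (384, 1, 1), 128, 16, 4),
    PersistentKernelConfig("B200", 160, (4, 4, 10), (384, 1, 1), 144, 64, 16),
    PersistentKernelConfig("H20 (option 1)", 78, (78, 1, 1), (384, 1, 1), 64, 56, 14),
    PersistentKernelConfig("H20 (option 2)", 78, (78, 1, 1), (384, 1, 1), 72, 24, 6),
]


def validate(cfg: PersistentKernelConfig) -> int:
    """Check the plan's invariants and return schedulers per scheduler SM."""
    # The grid launches exactly one thread block per physical SM.
    assert prod(cfg.grid_dim) == cfg.physical_sms, cfg.name
    # Worker SMs and scheduler SMs together cover every SM.
    assert cfg.num_workers + cfg.scheduler_sms == cfg.physical_sms, cfg.name
    # Schedulers divide evenly across the scheduler SMs.
    assert cfg.num_schedulers % cfg.scheduler_sms == 0, cfg.name
    return cfg.num_schedulers // cfg.scheduler_sms


for cfg in CONFIGS:
    print(f"{cfg.name}: {validate(cfg)} schedulers per scheduler SM")
```

The two H20 options trade worker SMs for scheduler SMs (64 + 14 versus 72 + 6); both pass the same checks.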

Labels: documentation (Improvements or additions to documentation)
