[Bug]: Incorrect MFMA Peak FLOPs Calculations for BF16 and F16 in `gfx941/0200_system-speed-of-light.yaml` · Issue #700 · ROCm/rocprofiler-compute



Open
ajassani opened this issue May 7, 2025 · 1 comment
Labels: bug, triage

Comments

ajassani commented May 7, 2025

Describe the bug

While validating peak throughput calculations on MI300X, I noticed that the MFMA metrics for BF16 and F16 in gfx941/0200_system-speed-of-light.yaml assume 4096 FLOPs per cycle per CU:

peak: ((($max_sclk * $cu_per_gpu) * 4096) / 1000)

This is incorrect. According to the CDNA3 whitepaper, Table 1, the correct peak throughput for BF16 and F16 MFMA is 2048 FLOPs per cycle per CU. The corrected expression should be:

peak: ((($max_sclk * $cu_per_gpu) * 2048) / 1000)
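As a rough sanity check, the two expressions can be compared numerically. This is only a minimal sketch: it assumes `$max_sclk` is expressed in MHz (so the result comes out in GFLOPs), uses 2100 MHz and 304 CUs as illustrative MI300X values, and the function name `mfma_peak_gflops` is hypothetical, not part of rocprofiler-compute.

```python
def mfma_peak_gflops(max_sclk_mhz: float, cu_per_gpu: int,
                     flops_per_cycle_per_cu: int) -> float:
    """Mirror the YAML expression ((max_sclk * cu_per_gpu) * N) / 1000.

    Assumption: max_sclk is in MHz, so MHz * CUs * FLOPs/cycle gives
    MFLOPs, and dividing by 1000 gives GFLOPs.
    """
    return (max_sclk_mhz * cu_per_gpu * flops_per_cycle_per_cu) / 1000


# Illustrative MI300X values (assumptions, not read from a device).
MAX_SCLK_MHZ = 2100
CU_PER_GPU = 304

# Value produced by the expression currently in the YAML (4096 FLOPs/cycle/CU).
current_gflops = mfma_peak_gflops(MAX_SCLK_MHZ, CU_PER_GPU, 4096)

# Value produced by the proposed expression (2048 FLOPs/cycle/CU).
proposed_gflops = mfma_peak_gflops(MAX_SCLK_MHZ, CU_PER_GPU, 2048)

print(f"current:  {current_gflops / 1000:.1f} TFLOPs")
print(f"proposed: {proposed_gflops / 1000:.1f} TFLOPs")
```

Under these assumptions the proposed expression gives about 1307.4 TFLOPs and the current one gives exactly twice that, so the reported utilization percentage for BF16 and F16 MFMA would be understated by a factor of two.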

Other MFMA-related metrics such as F8, F32, F64, and I8 appear to follow a similar pattern and may also require review. Let me know how you'd prefer to track those.
Thanks.

Linux Distribution

NA

ROCm Compute Profiler Version

NA

GPU

AMD MI300X

ROCm Version

No response

Cluster name (if applicable)

No response

Reproducer

See the code snippet from the source file quoted in the bug description above.

Expected behavior

No response

Relevant log output

Screenshots

No response

Additional Context

No response

ajassani added the bug and triage labels on May 7, 2025
feizheng10 (Contributor) commented

Thanks for pointing this out! I'll take a quick look and get back to you.
