
ROCm/aqlprofile


AQLprofile: Architected Queuing Language Profiling Library

AQLprofile is an open source library that enables advanced GPU profiling and tracing on AMD platforms. It works in conjunction with rocprofiler-sdk to support profiling methods such as performance counters (PMC) and SQ thread trace (SQTT). AQLprofile provides the foundational mechanisms for constructing AQL packets and managing profiling operations across multiple AMD GPU architecture families.

Background

AQLprofile builds on concepts from the Heterogeneous System Architecture (HSA) and Architected Queuing Language (AQL), which define the foundations for GPU command processing and profiling on AMD platforms.

Overview

AQLprofile is a companion library to rocprofiler-sdk. It provides the low-level mechanisms required by rocprofiler-sdk to enable advanced GPU profiling and tracing capabilities on AMD platforms. The development and evolution of AQLprofile are closely aligned with the needs of rocprofiler-sdk, ensuring compatibility and feature support for new GPU architectures and profiling requirements.

AQLprofile abstracts the complexity of constructing and managing AQL (Architected Queuing Language) packets, command buffers, and register programming. These components are essential for orchestrating profiling operations such as performance counter collection and thread tracing. The library supports several AMD GPU architecture families, including GFX9, GFX10, GFX11, and GFX12, and provides the infrastructure rocprofiler-sdk needs to interact with hardware-level profiling features.

Features

  • Profiling AQL packets for GPU workloads.
  • Performance counters (PMC) and SQ thread traces (SQTT).
  • Support for GFX9, GFX10, GFX11 and GFX12 architecture families.
  • Verbose tracing and error logging capabilities.
  • Thread trace binary data generated by AQLprofile can be decoded using rocprof-trace-decoder.

Supported Architectures and Counter Blocks

The AQLprofile library supports profiling and tracing GPU workloads across multiple architectures.
Below is a summary of the counter blocks supported for each architecture:

Counter Block Name GFX9 GFX908 GFX90A GFX942 GFX10 GFX11 GFX12
ATC
ATC_L2
CHA
CHC
CPC
CPF
CPG
GCEA
GCR
GDS
GL1A
GL1C
GL2A
GL2C
GRBM
GRBMH
GRBM_SE
GUS
MC_VM_L2
RPB
SDMA
SPI
SQ
SQ_CS
SX
TA
TCA
TCC
TCP
TD

Legend:

  • ✅: Supported
  • ❌: Not Supported

Build and Installation

Prerequisites

Ensure the following tools and dependencies are installed:

  • ROCm stack
  • rocm-llvm-dev (required to build tests)

Building AQLprofile

You can build AQLprofile either with the provided build script (recommended for most users) or by invoking CMake manually for custom builds.

Option 1: Using the Build Script (Recommended)

This will configure and build the project with default settings:

./build.sh

Option 2: Custom Build with CMake

For more control over the build process, you can set CMake options manually:

# Set CMAKE_PREFIX_PATH to the hsa-runtime include path and the hsa-runtime library path
export CMAKE_PREFIX_PATH=<path to hsa-runtime includes>:<path to hsa-runtime library>
# For example, if ROCm is installed at /opt/rocm:
# export CMAKE_PREFIX_PATH=/opt/rocm/include/hsa:/opt/rocm/lib

export CMAKE_BUILD_TYPE=release # or debug; release is the default

cd /path/to/aqlprofile
mkdir -p build
cd build
cmake ..
make -j"$(nproc)"
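
The manual steps above can also be collected into a small dry-run helper that prints the commands for a given source tree before anything is executed. This is only a sketch: the function names `aqlprofile_prefix_path` and `aqlprofile_build_plan` are hypothetical (they are not part of AQLprofile), the `include/hsa` and `lib` subdirectories are assumed from the `/opt/rocm` example, and paths are assumed to contain no spaces. It uses the equivalent `cmake -S/-B` and `cmake --build` forms in place of `cd build && cmake .. && make`.

```shell
# Hypothetical helper: derive the hsa-runtime include and library paths
# from a ROCm root (defaults to /opt/rocm, as in the example above).
aqlprofile_prefix_path() {
  local rocm_root
  rocm_root="${1:-/opt/rocm}"
  printf '%s:%s\n' "${rocm_root}/include/hsa" "${rocm_root}/lib"
}

# Hypothetical helper: print the configure and build commands without
# running them, so they can be reviewed first.
aqlprofile_build_plan() {
  local src_dir build_type rocm_root
  src_dir="$1"
  build_type="${2:-release}"
  rocm_root="${3:-/opt/rocm}"
  echo "export CMAKE_PREFIX_PATH=$(aqlprofile_prefix_path "$rocm_root")"
  echo "export CMAKE_BUILD_TYPE=${build_type}"
  echo "cmake -S ${src_dir} -B ${src_dir}/build"
  echo "cmake --build ${src_dir}/build --parallel"
}

# Show the plan for a checkout at /path/to/aqlprofile (placeholder path).
aqlprofile_build_plan /path/to/aqlprofile
```

Printing the plan rather than running it keeps the helper safe to try on a machine without ROCm installed; once the output looks right, the printed commands can be run as they are.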

Debug Trace Mode (optional; for debugging only)

To enable debug tracing, set the following environment variable before running CMake:

export CMAKE_DEBUG_TRACE=1

This enables verbose debug output of the command packets while the library executes.
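
Because debug tracing is meant for debugging only, it can be convenient to set the variable for a single command instead of exporting it for the whole shell session. The sketch below does this with the standard `env` utility; `with_debug_trace` is a hypothetical helper name, not something AQLprofile provides, and it assumes the variable only needs to be visible to the CMake process itself.

```shell
# Run one command with CMAKE_DEBUG_TRACE=1 set only for that command.
# with_debug_trace is a hypothetical helper name, not part of AQLprofile.
with_debug_trace() {
  env CMAKE_DEBUG_TRACE=1 "$@"
}

# Demonstration with a harmless command: the child process sees the variable,
# while the calling shell's environment is left unchanged.
with_debug_trace sh -c 'echo "CMAKE_DEBUG_TRACE=${CMAKE_DEBUG_TRACE}"'
# prints: CMAKE_DEBUG_TRACE=1
```

From the build directory, the configure step would then be invoked as `with_debug_trace cmake ..`, and later builds in the same shell are not affected unless the helper is used again.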

Installation

After building, install the AQLprofile libraries with:

cd build
sudo make install

Support

For issues or questions, please report them in the GitHub Issues section or contact AMD support at dl.ROCm-Profiler.support@amd.com.

License

AQLprofile is open source and distributed under the MIT License. See the LICENSE file for more details.
