AQLprofile is an open source library that enables advanced GPU profiling and tracing on AMD platforms. It works in conjunction with rocprofiler-sdk to support profiling methods such as performance counters (PMC) and SQ thread trace (SQTT). AQLprofile provides the foundational mechanisms for constructing AQL packets and managing profiling operations across multiple AMD GPU architecture families.
AQLprofile builds on concepts from the Heterogeneous System Architecture (HSA) and Architected Queuing Language (AQL), which define the foundations for GPU command processing and profiling on AMD platforms.
AQLprofile is a companion library to rocprofiler-sdk. It provides the low-level mechanisms required by rocprofiler-sdk to enable advanced GPU profiling and tracing capabilities on AMD platforms. The development and evolution of AQLprofile are closely aligned with the needs of rocprofiler-sdk, ensuring compatibility and feature support for new GPU architectures and profiling requirements.
AQLprofile abstracts the complexity of constructing and managing AQL (Architected Queuing Language) packets, command buffers, and register programming. These components are essential for orchestrating profiling operations such as performance counter collection and thread tracing. The library supports several AMD GPU architecture families, including GFX9, GFX10, GFX11, and GFX12, and provides the infrastructure rocprofiler-sdk needs to interact with hardware-level profiling features.
- Profiling AQL packets for GPU workloads.
- Performance counters (PMC) and SQ thread traces (SQTT).
- Support for GFX9, GFX10, GFX11 and GFX12 architecture families.
- Verbose tracing and error logging capabilities.
- Thread trace binary data that can be decoded with rocprof-trace-decoder.
The AQLprofile library supports profiling and tracing GPU workloads across multiple architectures.
Below is a summary of the counter blocks supported for each architecture:
Counter Block Name | GFX9 | GFX908 | GFX90A | GFX942 | GFX10 | GFX11 | GFX12 |
---|---|---|---|---|---|---|---|
ATC | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |
ATC_L2 | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
CHA | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
CHC | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
CPC | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
CPF | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
CPG | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
GCEA | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
GCR | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
GDS | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
GL1A | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
GL1C | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
GL2A | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
GL2C | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
GRBM | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
GRBMH | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
GRBM_SE | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
GUS | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
MC_VM_L2 | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
RPB | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
SDMA | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
SPI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
SQ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
SQ_CS | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
SX | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
TA | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
TCA | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
TCC | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
TCP | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
TD | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
Legend:
- ✅: Supported
- ❌: Not Supported
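To look up a device in the table, the gfx target name reported for a GPU (for example gfx90a) first has to be matched to a column. The following is a minimal sketch of that step. The function name arch_family and the pattern-to-column mapping are assumptions made for illustration and are not part of AQLprofile.

```shell
# Map a gfx target name to a column heading of the counter block table.
# Assumption: targets without their own column (e.g. other gfx9 parts)
# fall back to the generic family column.
arch_family() {
  case "$1" in
    gfx908) echo "GFX908" ;;
    gfx90a) echo "GFX90A" ;;
    gfx942) echo "GFX942" ;;
    gfx9*)  echo "GFX9" ;;
    gfx10*) echo "GFX10" ;;
    gfx11*) echo "GFX11" ;;
    gfx12*) echo "GFX12" ;;
    *)      echo "unknown"; return 1 ;;
  esac
}

arch_family gfx1100   # prints GFX11
```

The specific targets are listed before the gfx9* pattern so that they win over the generic match.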
Ensure the following tools and dependencies are installed:
- ROCm stack
- rocm-llvm-dev (required to build tests)
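Before building, it can help to confirm that the build tools are on PATH. The sketch below is one way to do that. The helper name missing_tools is hypothetical, and the tool list (cmake and make, taken from the manual build commands in this document) is an assumption rather than an official requirement list.

```shell
# Print each named command that is not found on PATH.
# Returns 0 when everything is present, 1 otherwise.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example check for the tools used by the manual build.
missing_tools cmake make || echo "install the missing tools before building"
```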
You can build AQLprofile using either the provided build script (recommended for most users) or by manually invoking CMake for custom builds.
Running the build script configures and builds the project with default settings:
./build.sh
For more control over the build process, you can set CMake options manually:
# Point CMAKE_PREFIX_PATH at the hsa-runtime library path and the hsa-runtime include path
export CMAKE_PREFIX_PATH=<path to hsa-runtime library>:<path to hsa-runtime includes>
# For example, if ROCm is installed at /opt/rocm:
# export CMAKE_PREFIX_PATH=/opt/rocm/lib:/opt/rocm/include/hsa
export CMAKE_BUILD_TYPE=<debug|release> # release by default
cd /path/to/aqlprofile
mkdir build
cd build
cmake ..
make -j
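When ROCm is installed somewhere other than /opt/rocm, the CMAKE_PREFIX_PATH value can be derived from the install root instead of typed by hand. This is a minimal sketch under two assumptions: the install keeps the same lib and include/hsa layout as the /opt/rocm example, and ROCM_PATH is set when a non-default root is wanted. The helper name hsa_prefix_path is hypothetical.

```shell
# Build a CMAKE_PREFIX_PATH value from a ROCm install root.
# Assumes the lib and include/hsa layout shown in the /opt/rocm example.
hsa_prefix_path() {
  rocm_root="${1:-/opt/rocm}"
  rocm_root="${rocm_root%/}"   # drop one trailing slash, if any
  echo "${rocm_root}/lib:${rocm_root}/include/hsa"
}

# Use ROCM_PATH if it is set (an assumption), otherwise /opt/rocm.
CMAKE_PREFIX_PATH="$(hsa_prefix_path "${ROCM_PATH:-/opt/rocm}")"
export CMAKE_PREFIX_PATH
echo "$CMAKE_PREFIX_PATH"
```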
To enable debug tracing, set the following environment variable before running CMake:
export CMAKE_DEBUG_TRACE=1
This enables verbose debug output of the command packets while the library executes.
After building, install the AQLprofile libraries with:
cd build
sudo make install
To report issues or ask questions, use the GitHub Issues section or contact AMD support at dl.ROCm-Profiler.support@amd.com.
AQLprofile is open source and distributed under the MIT License. See the LICENSE file for more details.