Preliminary support for NVIDIA Jetson boards #1692

dmitriy-philimonov · 2025-05-03T10:39:26Z

An NVIDIA Jetson device is an industrial, Linux-based embedded aarch64 platform with a powerful built-in GPU. It is used for AI tasks, mostly computer vision (CV).

The support is provided via --enable-nvidia-jetson switch in the configure script.

All the source code related to the NVIDIA Jetson is placed in the linux/NvidiaJetson.{h,c} source files and hidden by 'NVIDIA_JETSON' C preprocessor define. So, for x86_64 platforms the source code stays unchanged.

Additional functionality added by this commit:

Fix for the CPU temperature reading. The Jetson device is not supported by libsensors. The CPU has 8 cores with only one CPU temperature sensor for all of them, exposed through a thermal zone file. libsensors may be compiled in or turned off, so care was taken to provide a successful build both with and without libsensors.
A Jetson GPU meter was added: current load, frequency and temperature.

== Technical details ==

The code tries to locate the correct sensors during application startup. As an example, the sensor locations for NVIDIA Jetson Orin are the following:

CPU temperature: /sys/devices/virtual/thermal/thermal_zone0/type
GPU temperature: /sys/devices/virtual/thermal/thermal_zone1/type
GPU frequency: /sys/class/devfreq/17000000.gpu/cur_freq
GPU curr load: /sys/class/devfreq/17000000.gpu/device/load

Measure:

The GPU frequency is provided in Hz, shown in MHz.
The CPU/GPU temperatures are provided in Celsius multiplied by 1000 (millidegrees Celsius), shown in Celsius.

P.S. The GUI shows all temperatures for NVIDIA Jetson with additional precision compared to the default x86_64 platform.
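The unit handling described under "Measure" can be sketched as follows. This is a minimal illustration, not the code from the patch: the function names `read_sysfs_long`, `millicelsius_to_celsius` and `hz_to_mhz` are hypothetical. It also assumes the usual Linux thermal sysfs layout, where the numeric reading lives in a `temp` file next to the `type` file of the same thermal zone.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one integer from a sysfs-style text file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_sysfs_long(const char* path, long* out) {
   assert(path != NULL && out != NULL);
   FILE* fp = fopen(path, "r");
   if (!fp)
      return -1;
   long value = 0;
   int ok = (fscanf(fp, "%ld", &value) == 1);
   fclose(fp);
   if (!ok)
      return -1;
   *out = value;
   return 0;
}

/* Thermal zones report millidegrees Celsius; the meter shows Celsius. */
double millicelsius_to_celsius(long milli) {
   return (double)milli / 1000.0;
}

/* devfreq cur_freq reports Hz; the meter shows MHz. */
double hz_to_mhz(long hz) {
   return (double)hz / 1000000.0;
}
```

A caller would read, for example, `/sys/class/devfreq/17000000.gpu/cur_freq` with `read_sysfs_long` and pass the result to `hz_to_mhz` before display. Keeping the raw value as an integer until the last step is what allows the extra decimal digit of precision mentioned above.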

== NVIDIA Jetson models ==

Tested for NVIDIA Jetson Orin and Xavier boards.

Explorer09 · 2025-05-03T10:55:09Z

I fear the option of --enable-nvidia-jetson will make future board-specific customizations add similar configure options. That would make things unmaintainable.

Explorer09 · 2025-05-03T11:00:17Z

Another problem is the conflict with #1620, which is an attempt to unify the GPU meter structure to one interface.

Explorer09 · 2025-05-03T11:02:03Z

linux/NvidiaJetson.c

+   RichString_appendAscii(out, CRT_colors[METER_VALUE], buffer);
+
+   RichString_appendAscii(out, CRT_colors[METER_TEXT], " temp:");
+   xSnprintf(buffer, sizeof(buffer), "%.1f°C", this->values[JETSON_GPU_TEMP]);


Use CRT_degreeSign rather than hard-code a degree sign here.

Fixed. Additionally supported Fahrenheit.
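A minimal sketch of what this review point asks for, under stated assumptions: the degree sign is passed in as a string (standing in for htop's `CRT_degreeSign`) instead of being hard-coded in the format string, and a boolean selects Fahrenheit. The function name `format_temperature` and its parameters are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Formats a Celsius reading for display.
 * degree_sign stands in for CRT_degreeSign (hypothetical parameter);
 * fahrenheit mirrors a user setting (hypothetical parameter).
 * Returns the value returned by snprintf. */
int format_temperature(char* buf, size_t len, double celsius,
                       const char* degree_sign, bool fahrenheit) {
   assert(buf != NULL && degree_sign != NULL);
   if (fahrenheit) {
      double f = celsius * 9.0 / 5.0 + 32.0;
      return snprintf(buf, len, "%.1f%sF", f, degree_sign);
   }
   return snprintf(buf, len, "%.1f%sC", celsius, degree_sign);
}
```

Passing the sign as data keeps the format string pure ASCII, so the same code path works whether or not the terminal can render the UTF-8 degree character.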

Explorer09 · 2025-05-03T11:04:31Z

linux/NvidiaJetson.c

+      content[0] = tolower(content[0]);
+      content[1] = tolower(content[1]);
+      content[2] = tolower(content[2]);
+      content[3] = tolower(content[3]);


Why case conversion? Is there any reason for the letter case to vary?

Yeah, even Jetson Xavier and Jetson Orin have different sensor names. NVIDIA breaks backward compatibility here.
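As an illustration of matching sensor names regardless of letter case, a prefix comparison with POSIX `strncasecmp` avoids modifying the buffer in place. This is only a sketch of an alternative to the per-character `tolower` calls quoted above. The function name `zone_type_has_prefix` is hypothetical, and the sample strings in the usage note are illustrative rather than confirmed sensor names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Returns true if a thermal zone type string starts with prefix,
 * ignoring letter case. Neither string is modified. */
bool zone_type_has_prefix(const char* type, const char* prefix) {
   assert(type != NULL && prefix != NULL);
   return strncasecmp(type, prefix, strlen(prefix)) == 0;
}
```

With this helper, `zone_type_has_prefix("CPU-therm", "cpu")` and `zone_type_has_prefix("cpu-thermal", "cpu")` both match. If `tolower` is kept instead, its argument should be cast to `unsigned char` first, since passing a negative `char` value is undefined behavior.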

dmitriy-philimonov · 2025-05-03T11:20:54Z

Another problem is the conflict with #1620, which is an attempt to unify the GPU meter structure to one interface.

I looked through 'main' branch implementation of the GpuMeter. If I understand correctly, it collects information about the GPU usage from each running process.

NVIDIA Jetson takes a different approach: it provides separate GPU statistics via sysfs and the custom nvgpu driver.

Since all the NVIDIA Jetson specific code is hidden under the C define 'NVIDIA_JETSON', there should be no code collisions. Semantically, the 'NVIDIA_JETSON' switch could turn off all the future GPU code from #1620 and turn on the Jetson-specific GPU code (all the data is already collected by the nvgpu driver anyway).

You could merge the final version of #1620 first, then I'll figure out how to reuse it correctly, on the next big holidays :)

dmitriy-philimonov · 2025-05-03T11:23:48Z

I fear the option of --enable-nvidia-jetson will make future board-specific customizations add similar configure options. That would make things unmaintainable.

The main purpose of the commit was to minimize the interference with the major code base for the default x86_64 platform. Honestly, I do not want to compile in the nvidia jetson board specific code anywhere else.

What approach would you recommend here?

Explorer09 · 2025-05-03T18:07:35Z

I fear the option of --enable-nvidia-jetson will make future board-specific customizations add similar configure options. That would make things unmaintainable.

The main purpose of the commit was to minimize the interference with the major code base for the default x86_64 platform. Honestly, I do not want to compile in the nvidia jetson board specific code anywhere else.

What approach would you recommend here?

There are two ideas that came to my mind.

The more ideal one: Make the board identifier part of the machine type, so we can have --host=aarch64-nvidiajetson-linux-gnu. But that requires your toolchain to be configured with the same machine type identifier, which is sometimes not feasible.
The less ideal, but easier approach: name the configure option --with-board=nvidia_jetson. This assumes that htop would accept patches for additional board customizations, and I don't know the maintainers' attitude on this.

Update: Oh no. Nvidia didn't use a unique machine type for their GCC cross-toolchain. Reference

dmitriy-philimonov · 2025-05-04T16:55:27Z

I fear the option of --enable-nvidia-jetson will make future board-specific customizations add similar configure options. That would make things unmaintainable.

The main purpose of the commit was to minimize the interference with the major code base for the default x86_64 platform. Honestly, I do not want to compile in the nvidia jetson board specific code anywhere else.
What approach would you recommend here?

There are two ideas that came to my mind.

The more ideal one: Make the board identifier part of the machine type, so we can have --host=aarch64-nvidiajetson-linux-gnu. But that requires your toolchain to be configured with the same machine type identifier, which is sometimes not feasible.

The less ideal, but easier approach: name the configure option --with-board=nvidia_jetson. This assumes that htop would accept patches for additional board customizations, and I don't know the maintainers' attitude on this.

Update: Oh no. Nvidia didn't use a unique machine type for their GCC cross-toolchain. Reference

@BenBE, as a maintainer, do you agree? If so, I will rework the patch according to idea 2.

BenBE · 2025-05-04T21:35:25Z

There's some internal discussion still going on. We're still discussing which direction we'd like to move forward in.

An NVIDIA Jetson device is an industrial, Linux-based embedded aarch64 platform with a powerful built-in GPU. It is used for AI tasks, mostly computer vision (CV).

The support is provided via --enable-nvidia-jetson switch in the configure script.

All the source code related to the NVIDIA Jetson is placed in the linux/NvidiaJetson.{h,c} source files and hidden by 'NVIDIA_JETSON' C preprocessor define. So, for x86_64 platforms the source code stays unchanged.

Additional functionality added by this commit:

1. Fix for the CPU temperature reading. The Jetson device is not supported by libsensors. The CPU has 8 cores with only one CPU temperature sensor for all of them, exposed through a thermal zone file. libsensors may be compiled in or turned off, so care was taken to provide a successful build both with and without libsensors.
2. A Jetson GPU meter was added: current load, frequency and temperature.
3. The exact GPU memory allocated by each process is loaded from the nvgpu kernel driver via sysfs and merged into the LinuxProcess data (field LinuxProcess::gpu_mem). The "GPU_MEM" column visualizes this field. For root user only.
4. An additional filter for processes which use the GPU right now, toggled via the hot key 'g'; the help is updated. For root user only.

== Technical details ==

The code tries to locate the correct sensors during application startup. As an example, the sensor locations for NVIDIA Jetson Orin are the following:

- CPU temperature: /sys/devices/virtual/thermal/thermal_zone0/type
- GPU temperature: /sys/devices/virtual/thermal/thermal_zone1/type
- GPU frequency: /sys/class/devfreq/17000000.gpu/cur_freq
- GPU curr load: /sys/class/devfreq/17000000.gpu/device/load

Measure:

- The GPU frequency is provided in Hz, shown in MHz.
- The CPU/GPU temperatures are provided in Celsius multiplied by 1000 (millidegrees Celsius), shown in Celsius.

P.S. The GUI shows all temperatures for NVIDIA Jetson with additional precision compared to the default x86_64 platform.

If htop starts with root privileges (effective user id is 0), the experimental code activates. It reads the fixed sysfs file /sys/kernel/debug/nvmap/iovmm/clients with the following content, e.g.:

```
CLIENT   PROCESS       PID    SIZE
user     gpu_burn      7979   23525644K
user     gnome_shell   8119   5800K
user     Xorg          2651   17876K
total                         23549320K
```

Unfortunately, the /sys/kernel/debug/* files can be read only by the root user, which is why the restriction applies.

The patch also adds a separate field 'GPU_MEM', which reads data from the added LinuxProcess::gpu_mem field. The field stores the memory allocated for the GPU in kilobytes. It is populated by the function NvidiaJetson_LoadGpuProcessTable (the implementation is located in NvidiaJetson.c), which is called at the end of the function Machine_scanTables.

Additionally, a new Action is added: actionToggleGpuFilter, which is activated by the 'g' hot key (the help is updated appropriately). The GpuFilter shows only the processes which currently utilize the GPU (in effect, a much extended view of the nvmap/iovmm/clients table). It is achieved by the filtering machinery associated with ProcessTable::pidMatchList. The code constructs the GPU_PID_MATCH_LIST hash table, then actionToggleGpuFilter either stores it in ProcessTable::pidMatchList or restores the old value of ProcessTable::pidMatchList.

A separate LinuxProcess flag, PROCESS_FLAG_LINUX_GPU_JETSON (or something similar), isn't added for GPU_MEM, because the code that populates LinuxProcess::gpu_mem is currently shared with the construction of the GPU consumers filter. So, even if the GPU_MEM field is not activated, the filter showing GPU consumers should work. This architecture was chosen intentionally since it saves memory for the hash table GPU_PID_MATCH_LIST (which is now actually a set), and therefore increases performance. All other approaches convert GPU_PID_MATCH_LIST to a true key/value storage (key = pid, value = gpu memory allocated) with further merge code.

== NVIDIA Jetson models ==

Tested for NVIDIA Jetson Orin and Xavier boards.
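To make the table format concrete, here is a minimal sketch of parsing one row of the clients file. It is not the implementation of NvidiaJetson_LoadGpuProcessTable: the function name `parse_nvmap_client_line` is hypothetical, and the sketch assumes that process names contain no whitespace and that sizes always carry the `K` suffix, as in the sample above.

```c
#include <assert.h>
#include <stdio.h>

/* Parses one row such as "user gpu_burn 7979 23525644K".
 * On success stores the pid and the size in kilobytes and returns 0.
 * Returns -1 for the header row, the "total" row or malformed input,
 * leaving the outputs untouched. */
int parse_nvmap_client_line(const char* line, int* pid,
                            unsigned long long* size_kb) {
   assert(line != NULL && pid != NULL && size_kb != NULL);
   char client[32];
   char process[64];
   char unit = '\0';
   int p = 0;
   unsigned long long kb = 0;
   int n = sscanf(line, "%31s %63s %d %llu%c", client, process, &p, &kb, &unit);
   if (n != 5 || unit != 'K')
      return -1;
   *pid = p;
   *size_kb = kb;
   return 0;
}
```

Rows that do not have all four columns fail the conversion count check, so the header and the `total` line are skipped without special-casing them.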

dmitriy-philimonov · 2025-06-13T20:45:25Z

Changes:

Rebased. Honestly, I've left all the existing GPU-related code unchanged. It uses a different kernel API, and it might work with NVIDIA Jetson one day, who knows?
Additionally, pushed the per-process GPU memory allocation functionality right into the LinuxProcess class / GPU_MEM field in the main screen. Marked it as experimental, because it works with root privileges only. In short, it reads the special sysfs file inside the kernel/debug directory published by the nvgpu NVIDIA driver, which exposes the dictionary {pid -> gpu_memory}.

Added an Action for this functionality. After pressing the 'g' hot key, the main screen shows only the processes which use the GPU right now. With the GPU_MEM field enabled, you see the current GPU memory allocated per process. Useful, I guess. I hope you'll use the same approach in your future development.
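The toggle behavior can be sketched as a save-and-restore of the active PID filter. This is an illustration only: `FilterState` and `toggle_gpu_filter` are hypothetical stand-ins, and the opaque `void*` pointers take the place of the hash tables used by htop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state touched by the 'g' hot key. */
typedef struct {
   void* pidMatchList;   /* currently active filter, NULL shows everything */
   void* savedMatchList; /* filter that was active before the GPU filter */
   bool gpuFilterActive;
} FilterState;

/* Switches the GPU-only filter on or off.
 * gpuPidSet is the set of PIDs that currently use the GPU. */
void toggle_gpu_filter(FilterState* st, void* gpuPidSet) {
   assert(st != NULL);
   if (!st->gpuFilterActive) {
      /* Remember the previous filter so it can be restored later. */
      st->savedMatchList = st->pidMatchList;
      st->pidMatchList = gpuPidSet;
   } else {
      st->pidMatchList = st->savedMatchList;
      st->savedMatchList = NULL;
   }
   st->gpuFilterActive = !st->gpuFilterActive;
}
```

Pressing the key twice returns the view to whatever filter was active before, which is why the previous value is stored rather than simply cleared.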

dmitriy-philimonov · 2025-06-13T20:47:49Z

I've left all the deep details in both the commit message and the NvidiaJetson.c file. Have a look, please. @BenBE

Finally, with the "Jetson GPU" meter, the 'g' hot key applied and the GPU_MEM field enabled, htop looks like this:

Explorer09 reviewed May 3, 2025

BenBE added the Linux 🐧 Linux related issues label May 3, 2025

dmitriy-philimonov force-pushed the nvidia-jetson branch 2 times, most recently from 53914a0 to 68ddb34 Compare May 3, 2025 15:41

dmitriy-philimonov force-pushed the nvidia-jetson branch 2 times, most recently from f836498 to 0f0f553 Compare June 13, 2025 20:32

dmitriy-philimonov force-pushed the nvidia-jetson branch from 0f0f553 to 109046f Compare June 13, 2025 20:33
