8000 error with quickstart · Issue #37 · NVIDIA/nvkind · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

error with quickstart #37

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
geraldstanje opened this issue Mar 20, 2025 · 1 comment
Open

error with quickstart #37

geraldstanje opened this issue Mar 20, 2025 · 1 comment

Comments

@geraldstanje
Copy link
geraldstanje commented Mar 20, 2025

hi,

i run into the following error trying to create a cluster with 1 gpu worker - any idea?
used docs: https://github.com/NVIDIA/nvkind?tab=readme-ov-file#quickstart
i built nvkind from source - with latest main branch.

⚡ main ~/nvkind ./nvkind cluster create
Creating cluster "nvkind-d27vp" ...

 ✓ Ensuring node image (kindest/node:v1.32.2) 🖼
 ✓ Preparing nodes 📦 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
 ✓ Joining worker nodes 🚜 
Set kubectl context to "kind-nvkind-d27vp"
You can now use your cluster with:

kubectl cluster-info --context kind-nvkind-d27vp

Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 Packages [8792 kB]
Get:5 http://deb.debian.org/debian bookworm-updates/main amd64 Packages [13.5 kB]
Get:6 http://deb.debian.org/debian-security bookworm-security/main amd64 Packages [249 kB]
Fetched 9310 kB in 1s (6246 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  dirmngr gnupg gnupg-l10n gnupg-utils gpg-agent gpg-wks-client gpg-wks-server
  gpgconf gpgsm libassuan0 libksba8 libnpth0 pinentry-curses
Suggested packages:
  dbus-user-session libpam-systemd pinentry-gnome3 tor parcimonie xloadimage
  scdaemon pinentry-doc
The following NEW packages will be installed:
  dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client
  gpg-wks-server gpgconf gpgsm libassuan0 libksba8 libnpth0 pinentry-curses
0 upgraded, 14 newly installed, 0 to remove and 12 not upgraded.
Need to get 7881 kB of archives.
After this operation, 16.0 MB of additional disk space will be used.
Get:1 http://deb.debian.org/debian bookworm/main amd64 libassuan0 amd64 2.5.5-5 [48.5 kB]
Get:2 http://deb.debian.org/debian bookworm/main amd64 gpgconf amd64 2.2.40-1.1 [564 kB]
Get:3 http://deb.debian.org/debian bookworm/main amd64 libksba8 amd64 1.6.3-2 [128 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 libnpth0 amd64 1.6-3 [19.0 kB]
Get:5 http://deb.debian.org/debian bookworm/main amd64 dirmngr amd64 2.2.40-1.1 [792 kB]
Get:6 http://deb.debian.org/debian bookworm/main amd64 gnupg-l10n all 2.2.40-1.1 [1093 kB]
Get:7 http://deb.debian.org/debian bookworm/main amd64 gnupg-utils amd64 2.2.40-1.1 [927 kB]
Get:8 http://deb.debian.org/debian bookworm/main amd64 gpg amd64 2.2.40-1.1 [949 kB]
Get:9 http://deb.debian.org/debian bookworm/main amd64 pinentry-curses amd64 1.2.1-1 [77.4 kB]
Get:10 http://deb.debian.org/debian bookworm/main amd64 gpg-agent amd64 2.2.40-1.1 [695 kB]
Get:11 http://deb.debian.org/debian bookworm/main amd64 gpg-wks-client amd64 2.2.40-1.1 [541 kB]
Get:12 http://deb.debian.org/debian bookworm/main amd64 gpg-wks-server amd64 2.2.40-1.1 [531 kB]
Get:13 http://deb.debian.org/debian bookworm/main amd64 gpgsm amd64 2.2.40-1.1 [671 kB]
Get:14 http://deb.debian.org/debian bookworm/main amd64 gnupg all 2.2.40-1.1 [846 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 7881 kB in 0s (97.2 MB/s)
Selecting previously unselected package libassuan0:amd64.
(Reading database ... 9629 files and directories currently installed.)
Preparing to unpack .../00-libassuan0_2.5.5-5_amd64.deb ...
Unpacking libassuan0:amd64 (2.5.5-5) ...
Selecting previously unselected package gpgconf.
Preparing to unpack .../01-gpgconf_2.2.40-1.1_amd64.deb ...
Unpacking gpgconf (2.2.40-1.1) ...
Selecting previously unselected package libksba8:amd64.
Preparing to unpack .../02-libksba8_1.6.3-2_amd64.deb ...
Unpacking libksba8:amd64 (1.6.3-2) ...
Selecting previously unselected package libnpth0:amd64.
Preparing to unpack .../03-libnpth0_1.6-3_amd64.deb ...
Unpacking libnpth0:amd64 (1.6-3) ...
Selecting previously unselected package dirmngr.
Preparing to unpack .../04-dirmngr_2.2.40-1.1_amd64.deb ...
Unpacking dirmngr (2.2.40-1.1) ...
Selecting previously unselected package gnupg-l10n.
Preparing to unpack .../05-gnupg-l10n_2.2.40-1.1_all.deb ...
Unpacking gnupg-l10n (2.2.40-1.1) ...
Selecting previously unselected package gnupg-utils.
Preparing to unpack .../06-gnupg-utils_2.2.40-1.1_amd64.deb ...
Unpacking gnupg-utils (2.2.40-1.1) ...
Selecting previously unselected package gpg.
Preparing to unpack .../07-gpg_2.2.40-1.1_amd64.deb ...
Unpacking gpg (2.2.40-1.1) ...
Selecting previously unselected package pinentry-curses.
Preparing to unpack .../08-pinentry-curses_1.2.1-1_amd64.deb ...
Unpacking pinentry-curses (1.2.1-1) ...
Selecting previously unselected package gpg-agent.
Preparing to unpack .../09-gpg-agent_2.2.40-1.1_amd64.deb ...
Unpacking gpg-agent (2.2.40-1.1) ...
Selecting previously unselected package gpg-wks-client.
Preparing to unpack .../10-gpg-wks-client_2.2.40-1.1_amd64.deb ...
Unpacking gpg-wks-client (2.2.40-1.1) ...
Selecting previously unselected package gpg-wks-server.
Preparing to unpack .../11-gpg-wks-server_2.2.40-1.1_amd64.deb ...
Unpacking gpg-wks-server (2.2.40-1.1) ...
Selecting previously unselected package gpgsm.
Preparing to unpack .../12-gpgsm_2.2.40-1.1_amd64.deb ...
Unpacking gpgsm (2.2.40-1.1) ...
Selecting previously unselected package gnupg.
Preparing to unpack .../13-gnupg_2.2.40-1.1_all.deb ...
Unpacking gnupg (2.2.40-1.1) ...
Setting up libksba8:amd64 (1.6.3-2) ...
Setting up libnpth0:amd64 (1.6-3) ...
Setting up libassuan0:amd64 (2.5.5-5) ...
Setting up gnupg-l10n (2.2.40-1.1) ...
Setting up gpgconf (2.2.40-1.1) ...
Setting up gpg (2.2.40-1.1) ...
Setting up gnupg-utils (2.2.40-1.1) ...
Setting up pinentry-curses (1.2.1-1) ...
Setting up gpg-agent (2.2.40-1.1) ...
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-browser.socket → /usr/lib/systemd/user/gpg-agent-browser.socket.
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-extra.socket → /usr/lib/systemd/user/gpg-agent-extra.socket.
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-ssh.socket → /usr/lib/systemd/user/gpg-agent-ssh.socket.
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent.socket → /usr/lib/systemd/user/gpg-agent.socket.
Setting up gpgsm (2.2.40-1.1) ...
Setting up dirmngr (2.2.40-1.1) ...
Created symlink /etc/systemd/user/sockets.target.wants/dirmngr.socket → /usr/lib/systemd/user/dirmngr.socket.
Setting up gpg-wks-server (2.2.40-1.1) ...
Setting up gpg-wks-client (2.2.40-1.1) ...
Setting up gnupg (2.2.40-1.1) ...
Processing triggers for libc-bin (2.36-9+deb12u9) ...
deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/experimental/deb/$(ARCH) /
Hit:1 http://deb.debian.org/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian bookworm-updates InRelease
Hit:3 http://deb.debian.org/debian-security bookworm-security InRelease
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  InRelease [1477 B]
Get:5 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  Packages [17.7 kB]
Fetched 19.1 kB in 0s (67.1 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit-base
The following NEW packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit
  nvidia-container-toolkit-base
0 upgraded, 4 newly installed, 0 to remove and 12 not upgraded.
Need to get 5843 kB of archives.
After this operation, 27.9 MB of additional disk space will be used.
Get:1 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  libnvidia-container1 1.17.5-1 [925 kB]
Get:2 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  libnvidia-container-tools 1.17.5-1 [20.2 kB]
Get:3 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  nvidia-container-toolkit-base 1.17.5-1 [3709 kB]
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  nvidia-container-toolkit 1.17.5-1 [1189 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 5843 kB in 0s (18.9 MB/s)
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 9884 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.17.5-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.17.5-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.17.5-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.17.5-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.17.5-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.17.5-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.17.5-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.17.5-1) ...
Setting up nvidia-container-toolkit-base (1.17.5-1) ...
Setting up libnvidia-container1:amd64 (1.17.5-1) ...
Setting up libnvidia-container-tools (1.17.5-1) ...
Setting up nvidia-container-toolkit (1.17.5-1) ...
Processing triggers for libc-bin (2.36-9+deb12u9) ...
time="2025-03-20T20:42:47Z" level=info msg="Using config version 3"
time="2025-03-20T20:42:47Z" level=info msg="Using CRI runtime plugin name \"io.containerd.cri.v1.runtime\""
time="2025-03-20T20:42:47Z" level=info msg="Wrote updated config to /etc/containerd/config.toml"
⚡ main ~/nvkind kind delete cluster                                  
Deleting cluster "kind" ...
⚡ main ~/nvkind ./nvkind cluster create                              
Creating cluster "nvkind-4xqdp" ...
 ✓ Ensuring node image (kindest/node:v1.32.2) 🖼
 ✓ Preparing nodes 📦 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
 ✓ Joining worker nodes 🚜 
Set kubectl context to "kind-nvkind-4xqdp"
You can now use your cluster with:

kubectl cluster-info --context kind-nvkind-4xqdp

Have a nice day! 👋
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 Packages [8792 kB]
Get:5 http://deb.debian.org/debian bookworm-updates/main amd64 Packages [13.5 kB]
Get:6 http://deb.debian.org/debian-security bookworm-security/main amd64 Packages [249 kB]
Fetched 9310 kB in 2s (5973 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  dirmngr gnupg gnupg-l10n gnupg-utils gpg-agent gpg-wks-client gpg-wks-server
  gpgconf gpgsm libassuan0 libksba8 libnpth0 pinentry-curses
Suggested packages:
  dbus-user-session libpam-systemd pinentry-gnome3 tor parcimonie xloadimage
  scdaemon pinentry-doc
The following NEW packages will be installed:
  dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client
  gpg-wks-server gpgconf gpgsm libassuan0 libksba8 libnpth0 pinentry-curses
0 upgraded, 14 newly installed, 0 to remove and 12 not upgraded.
Need to get 7881 kB of archives.
After this operation, 16.0 MB of additional disk space will be used.
Get:1 http://deb.debian.org/debian bookworm/main amd64 libassuan0 amd64 2.5.5-5 [48.5 kB]
Get:2 http://deb.debian.org/debian bookworm/main amd64 gpgconf amd64 2.2.40-1.1 [564 kB]
Get:3 http://deb.debian.org/debian bookworm/main amd64 libksba8 amd64 1.6.3-2 [128 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 libnpth0 amd64 1.6-3 [19.0 kB]
Get:5 http://deb.debian.org/debian bookworm/main amd64 dirmngr amd64 2.2.40-1.1 [792 kB]
Get:6 http://deb.debian.org/debian bookworm/main amd64 gnupg-l10n all 2.2.40-1.1 [1093 kB]
Get:7 http://deb.debian.org/debian bookworm/main amd64 gnupg-utils amd64 2.2.40-1.1 [927 kB]
Get:8 http://deb.debian.org/debian bookworm/main amd64 gpg amd64 2.2.40-1.1 [949 kB]
Get:9 http://deb.debian.org/debian bookworm/main amd64 pinentry-curses amd64 1.2.1-1 [77.4 kB]
Get:10 http://deb.debian.org/debian bookworm/main amd64 gpg-agent amd64 2.2.40-1.1 [695 kB]
Get:11 http://deb.debian.org/debian bookworm/main amd64 gpg-wks-client amd64 2.2.40-1.1 [541 kB]
Get:12 http://deb.debian.org/debian bookworm/main amd64 gpg-wks-server amd64 2.2.40-1.1 [531 kB]
Get:13 http://deb.debian.org/debian bookworm/main amd64 gpgsm amd64 2.2.40-1.1 [671 kB]
Get:14 http://deb.debian.org/debian bookworm/main amd64 gnupg all 2.2.40-1.1 [846 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 7881 kB in 0s (85.0 MB/s)
Selecting previously unselected package libassuan0:amd64.
(Reading database ... 9629 files and directories currently installed.)
Preparing to unpack .../00-libassuan0_2.5.5-5_amd64.deb ...
Unpacking libassuan0:amd64 (2.5.5-5) ...
Selecting previously unselected package gpgconf.
Preparing to unpack .../01-gpgconf_2.2.40-1.1_amd64.deb ...
Unpacking gpgconf (2.2.40-1.1) ...
Selecting previously unselected package libksba8:amd64.
Preparing to unpack .../02-libksba8_1.6.3-2_amd64.deb ...
Unpacking libksba8:amd64 (1.6.3-2) ...
Selecting previously unselected package libnpth0:amd64.
Preparing to unpack .../03-libnpth0_1.6-3_amd64.deb ...
Unpacking libnpth0:amd64 (1.6-3) ...
Selecting previously unselected package dirmngr.
Preparing to unpack .../04-dirmngr_2.2.40-1.1_amd64.deb ...
Unpacking dirmngr (2.2.40-1.1) ...
Selecting previously unselected package gnupg-l10n.
Preparing to unpack .../05-gnupg-l10n_2.2.40-1.1_all.deb ...
Unpacking gnupg-l10n (2.2.40-1.1) ...
Selecting previously unselected package gnupg-utils.
Preparing to unpack .../06-gnupg-utils_2.2.40-1.1_amd64.deb ...
Unpacking gnupg-utils (2.2.40-1.1) ...
Selecting previously unselected package gpg.
Preparing to unpack .../07-gpg_2.2.40-1.1_amd64.deb ...
Unpacking gpg (2.2.40-1.1) ...
Selecting previously unselected package pinentry-curses.
Preparing to unpack .../08-pinentry-curses_1.2.1-1_amd64.deb ...
Unpacking pinentry-curses (1.2.1-1) ...
Selecting previously unselected package gpg-agent.
Preparing to unpack .../09-gpg-agent_2.2.40-1.1_amd64.deb ...
Unpacking gpg-agent (2.2.40-1.1) ...
Selecting previously unselected package gpg-wks-client.
Preparing to unpack .../10-gpg-wks-client_2.2.40-1.1_amd64.deb ...
Unpacking gpg-wks-client (2.2.40-1.1) ...
Selecting previously unselected package gpg-wks-server.
Preparing to unpack .../11-gpg-wks-server_2.2.40-1.1_amd64.deb ...
Unpacking gpg-wks-server (2.2.40-1.1) ...
Selecting previously unselected package gpgsm.
Preparing to unpack .../12-gpgsm_2.2.40-1.1_amd64.deb ...
Unpacking gpgsm (2.2.40-1.1) ...
Selecting previously unselected package gnupg.
Preparing to unpack .../13-gnupg_2.2.40-1.1_all.deb ...
Unpacking gnupg (2.2.40-1.1) ...
Setting up libksba8:amd64 (1.6.3-2) ...
Setting up libnpth0:amd64 (1.6-3) ...
Setting up libassuan0:amd64 (2.5.5-5) ...
Setting up gnupg-l10n (2.2.40-1.1) ...
Setting up gpgconf (2.2.40-1.1) ...
Setting up gpg (2.2.40-1.1) ...
Setting up gnupg-utils (2.2.40-1.1) ...
Setting up pinentry-curses (1.2.1-1) ...
Setting up gpg-agent (2.2.40-1.1) ...
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-browser.socket → /usr/lib/systemd/user/gpg-agent-browser.socket.
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-extra.socket → /usr/lib/systemd/user/gpg-agent-extra.socket.
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-ssh.socket → /usr/lib/systemd/user/gpg-agent-ssh.socket.
Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent.socket → /usr/lib/systemd/user/gpg-agent.socket.
Setting up gpgsm (2.2.40-1.1) ...
Setting up dirmngr (2.2.40-1.1) ...
Created symlink /etc/systemd/user/sockets.target.wants/dirmngr.socket → /usr/lib/systemd/user/dirmngr.socket.
Setting up gpg-wks-server (2.2.40-1.1) ...
Setting up gpg-wks-client (2.2.40-1.1) ...
Setting up gnupg (2.2.40-1.1) ...
Processing triggers for libc-bin (2.36-9+deb12u9) ...
deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/experimental/deb/$(ARCH) /
Hit:1 http://deb.debian.org/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian bookworm-updates InRelease
Hit:3 http://deb.debian.org/debian-security bookworm-security InRelease
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  InRelease [1477 B]
Get:5 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  Packages [17.7 kB]
Fetched 19.1 kB in 0s (67.8 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit-base
The following NEW packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit
  nvidia-container-toolkit-base
0 upgraded, 4 newly installed, 0 to remove and 12 not upgraded.
Need to get 5843 kB of archives.
After this operation, 27.9 MB of additional disk space will be used.
Get:1 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  libnvidia-container1 1.17.5-1 [925 kB]
Get:2 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  libnvidia-container-tools 1.17.5-1 [20.2 kB]
Get:3 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  nvidia-container-toolkit-base 1.17.5-1 [3709 kB]
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  nvidia-container-toolkit 1.17.5-1 [1189 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 5843 kB in 0s (37.3 MB/s)
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 9884 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.17.5-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.17.5-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.17.5-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.17.5-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.17.5-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.17.5-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.17.5-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.17.5-1) ...
Setting up nvidia-container-toolkit-base (1.17.5-1) ...
Setting up libnvidia-container1:amd64 (1.17.5-1) ...
Setting up libnvidia-container-tools (1.17.5-1) ...
Setting up nvidia-container-toolkit (1.17.5-1) ...
Processing triggers for libc-bin (2.36-9+deb12u9) ...
time="2025-03-20T20:45:10Z" level=info msg="Using config version 3"
time="2025-03-20T20:45:10Z" level=info msg="Using CRI runtime plugin name \"io.containerd.cri.v1.runtime\""
time="2025-03-20T20:45:10Z" level=info msg="Wrote updated config to /etc/containerd/config.toml"
time="2025-03-20T20:45:10Z" level=info msg="It is recommended that containerd daemon be restarted."
umount: /proc/driver/nvidia: not mounted
F0320 20:45:10.706458  156354 main.go:45] Error: patching /proc/driver/nvidia on node 'nvkind-4xqdp-worker': running script on nvkind-4xqdp-worker: executing command: exit status 1


⚡ main ~/nvkind nvidia-smi
Thu Mar 20 20:45:20 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.230.02             Driver Version: 535.230.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:1E.0 Off |                    0 |
| N/A   29C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

⚡ main ~/nvkind uname -a
Linux ip-xxx xxx-aws #84~20.04.1-Ubuntu SMP Mon Jan 20 22:14:54 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
⚡ main ~/nvkind nvidia-smi -L
GPU 0: Tesla T4 (UUID: GPU-ff4d3463-692e-0415-3f0d-3747fc4e01fa)
⚡ main ~/nvkind docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all ubuntu:20.04 nvidia-smi -L
Unable to find image 'ubuntu:20.04' locally
20.04: Pulling from library/ubuntu
d9802f032d67: Pull complete 
Digest: sha256:8e5c4f0285ecbb4ead070431d29b576a530d3166df73ec44affc1cd27555141b
Status: Downloaded newer image for ubuntu:20.04
GPU 0: Tesla T4 (UUID: GPU-ff4d3463-692e-0415-3f0d-3747fc4e01fa)

also have a follow up question about how can i add the extraMount and role control-plane to create this cluster? the worker should be use 1 gpu according to above:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
    extraMounts:
    - hostPath: /<path>/.cache/huggingface
      containerPath: /root/.cache/huggingface

cc @ArangoGutierrez @klueska

@geraldstanje geraldstanje changed the title error with 1 gpu error with sample config Mar 20, 2025
@geraldstanje geraldstanje changed the title error with sample config error with quickstart Mar 20, 2025
@geraldstanje
6F5D Copy link
Author
geraldstanje commented Mar 21, 2025

we can close this, i forgot:

sudo nvidia-ctk runtime configure --runtime=docker --set-as-default --cdi.enabled
sudo nvidia-ctk config --set accept-nvidia-visible-devices-as-volume-mounts=true --in-place
sudo systemctl restart docker

but now see same issue as in #31 - could you check this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant
0