Merge pull request #9 from svc-design/codex/fix-missing-nvidia-container-runtime-package

Fix GPU role packages
shenlan 2025-06-25 20:28:42 +08:00 committed by GitHub
commit 0e47349cc9
3 changed files with 13 additions and 9 deletions

@@ -7,7 +7,7 @@ This document describes how to use the `gpu-k8s` role to deploy a simple Kuberne
  The role performs three main tasks:
  1. **Create the Kubernetes cluster** using [sealos](https://github.com/labring/sealos). It runs the provided `sealos run` command to bootstrap the master and worker nodes.
- 2. **Install NVIDIA drivers and container runtime** on the target hosts so that Kubernetes can access GPU resources.
+ 2. **Install NVIDIA drivers and the NVIDIA container toolkit** on the target hosts so that Kubernetes can access GPU resources.
  3. **Verify GPU access** by deploying the official NVIDIA device plugin and running a small CUDA workload.

@@ -1,9 +1,11 @@
- - name: Add NVIDIA repository
+ - name: Add NVIDIA repositories
  shell: |
  add-apt-repository -y ppa:graphics-drivers
- curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | apt-key add -
  distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
- curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | tee /etc/apt/sources.list.d/nvidia-container-runtime.list
+ curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | apt-key add -
+ curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | tee /etc/apt/sources.list.d/nvidia-container-runtime.list
+ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
+ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list
  apt-get update
  - name: Install NVIDIA driver and container runtime
@@ -12,6 +14,6 @@
  - nvidia-modprobe
  - nvidia-driver-535
  - nvidia-headless-535
- - nvidia-container-runtime
+ - nvidia-container-toolkit
  state: present
  update_cache: yes

@@ -1,9 +1,11 @@
- - name: Add NVIDIA repository
+ - name: Add NVIDIA repositories
  shell: |
  add-apt-repository -y ppa:graphics-drivers
- curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | apt-key add -
  distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
- curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | tee /etc/apt/sources.list.d/nvidia-container-runtime.list
+ curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | apt-key add -
+ curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | tee /etc/apt/sources.list.d/nvidia-container-runtime.list
+ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
+ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list
  apt-get update
  args:
  executable: /bin/bash
@@ -15,7 +17,7 @@
  - nvidia-modprobe
  - nvidia-driver-535
  - nvidia-headless-535
- - nvidia-container-runtime
+ - nvidia-container-toolkit
  state: present
  update_cache: yes
  become: true
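
Both shell tasks above build the repository URLs from a `$distribution` value derived from `/etc/os-release`. The following is a minimal sketch of that step in isolation, useful for checking what string a given host would produce before the `curl` calls run. The function name `detect_distribution` and its optional path argument are hypothetical and are not part of the role.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gpu-k8s role): compute the
# "$ID$VERSION_ID" string that the tasks above store in $distribution.
detect_distribution() {
  # Optional first argument: path to an os-release style file.
  # Defaults to the system file used by the role.
  local os_release="${1:-/etc/os-release}"
  # Source the file in a subshell so ID and VERSION_ID do not leak
  # into the calling shell, then print them concatenated.
  ( . "$os_release" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

On a host whose `/etc/os-release` sets `ID=ubuntu` and `VERSION_ID="22.04"`, calling `detect_distribution` with no argument prints `ubuntu22.04`, which is the path segment substituted into the `.list` URLs.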