Running GPU workloads on K3S

A team I work with is very happy with the k8s / ArgoCD setup we put in place, and now wants to manage their experimental ML workloads in k8s as well. These workloads run on Lambda Labs, which we use to quickly spin up a few GPU-enabled Ubuntu machines of varying sizes. Those servers are not part of a managed k8s cluster; instead, you get SSH and Jupyter access and are expected to deploy your workload yourself.

Onboarding them to k8s seemed like a good use case for k3s, a batteries-included, curl-to-bash installation of k8s. Getting a k8s node running and a CPU-based pod on it was pretty trivial. However, I couldn’t find a single consistent source on how to run GPU workloads. After cobbling it together from several places and getting it to work, I’m happy to share my particular rain dance for Nvidia GPUs on k3s:

1. Install k3s

curl -sfL https://get.k3s.io | sh -

Obviously modify with additional parameters if needed. My actual config is:

MY_IP=$(curl ifconfig.me)
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--node-external-ip=$MY_IP --flannel-backend=wireguard-native --flannel-external-ip --tls-san $MY_IP" sh -
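
Before moving on, it’s worth a quick sanity check that the node came up at all. Assuming a single-node setup like the one here, something like the following should show the node as Ready shortly after the install finishes:

sudo kubectl get nodes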

2. Install Nvidia’s GPU operator

This will handle detecting GPUs on the node, installing the relevant drivers and runtimes, and reporting the GPUs and their metadata back to the cluster for scheduling. It’s done using the HelmChart CRD included with k3s:

cat <<EOF | sudo kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: gpu-operator
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: nvidiagpu
  namespace: gpu-operator
spec:
  chart: gpu-operator
  repo: https://helm.ngc.nvidia.com/nvidia
EOF

Give it a minute, and then verify:

sudo kubectl get node -o json | jq '.items[0].status.capacity'
{
"cpu": "30",
  "ephemeral-storage": "1422559648Ki",
  "hugepages-1Gi": "0",
  "hugepages-2Mi": "0",
  "memory": "232949228Ki",
  "nvidia.com/gpu": "1",
  "pods": "110"
}

The important part is that the node has nvidia.com/gpu resources detected.
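
If the GPU resource doesn’t show up yet, the operator is most likely still rolling out its components (driver, container toolkit, device plugin, and so on). A sanity check I find useful, assuming the gpu-operator namespace from the manifest above, is to watch its pods until they are all Running or Completed, then re-check the node capacity:

sudo kubectl get pods -n gpu-operator --watch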

3. Test pod

Create a test pod that requires a GPU:

cat <<EOF | sudo kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu
spec:
  runtimeClassName: nvidia
  restartPolicy: Never
  nodeSelector:
    nvidia.com/gpu.product: "NVIDIA-A100-SXM4-80GB"
  containers:
    - name: gpu
      image: "ubuntu"
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

Things to note/configure in the above template:

  • runtimeClassName: Required to expose the Nvidia runtime to the pod
  • nodeSelector > nvidia.com/gpu.product: Asks for a specific GPU model (in case you have a heterogeneous environment). Collect the model’s string from kubectl describe node, or see the snippet after this list. Delete if unneeded
  • containers > resources > limits > nvidia.com/gpu: How many GPUs the pod needs (affects scheduling and actual access)
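
A quicker way to grab that product string than scanning the full kubectl describe node output, assuming the operator’s node labelling has already run, is to print the label directly:

sudo kubectl get nodes -L nvidia.com/gpu.product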

After the pod is submitted and running (kubectl get pod), you can run some CLI tools in it and verify the GPU is actually visible, e.g.:

kubectl exec -it gpu -- nvidia-smi

4. Further configuration

Once that’s done, you can remove the test pod (kubectl delete pod gpu) and continue configuring your cluster - connecting to ArgoCD, deploying actual workloads, etc.
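
For reference, real workloads follow the same pattern as the test pod above: set the runtime class and request GPUs in the resource limits. Here’s a minimal sketch of a Deployment you might commit to the repo ArgoCD watches; the name, namespace and image are placeholders, not anything from our actual setup:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-experiment        # placeholder name
  namespace: ml-workloads    # placeholder namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ml-experiment
  template:
    metadata:
      labels:
        app: ml-experiment
    spec:
      runtimeClassName: nvidia    # same requirement as the test pod
      containers:
        - name: trainer
          image: registry.example.com/ml/trainer:latest   # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1   # GPUs this workload gets scheduled for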

5. Further reading