Stable Diffusion WebUI on GKE Autopilot

I recently set out to run Stable Diffusion on GKE in Autopilot mode, building a container from scratch using AUTOMATIC1111’s webui. This is likely not how you’d host a Stable Diffusion service in production (which would make a good topic for another blog post), but it’s a fun way to try out the…

CUDA 12 on GKE Autopilot

Per the NVIDIA docs, CUDA 12 applications require driver version 525.60.04 or later. This driver is available as part of GKE 1.28. To upgrade an existing cluster to the latest version of 1.28: This upgrades the control plane, and schedules the nodes to follow, which generally completes within a day or two (depending on how many nodes you…
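
As a rough illustration of the driver requirement above, here is a minimal sketch that checks whether a reported driver version meets the CUDA 12 minimum. The function names are hypothetical, and it assumes the version is a plain dotted string of integers such as 525.60.04.

```python
# Minimum driver version for CUDA 12 applications, per the NVIDIA docs cited above.
CUDA12_MIN_DRIVER = "525.60.04"


def parse_driver_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string like '525.60.04' into (525, 60, 4)."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_cuda12(driver_version: str) -> bool:
    """Return True if the driver version is at least the CUDA 12 minimum.

    Hypothetical helper: compares versions component by component.
    """
    return parse_driver_version(driver_version) >= parse_driver_version(CUDA12_MIN_DRIVER)
```

Comparing tuples of integers avoids the pitfalls of comparing the raw strings, where "525.9" would sort after "525.60".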

Finding the NVIDIA Driver Version on GKE

Update: this information is now available in the official docs. If you want to know which version of the GPU drivers is active on GKE, here’s a one-liner: What this command does is get all the logs of Pods with the label k8s-app=nvidia-gpu-device-plugin (there are several different DaemonSets that can install the drivers depending on…
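
The log-fetching step described above could be sketched roughly as follows. This is not the post's original one-liner, which is not reproduced here. The helper names are hypothetical, and the kube-system namespace is an assumption; only the label comes from the text.

```python
import subprocess


def driver_log_command(
    namespace: str = "kube-system",  # assumption: where the device plugin Pods run
    label: str = "k8s-app=nvidia-gpu-device-plugin",  # label named in the post
) -> list[str]:
    """Build a kubectl command that fetches all logs of Pods matching the label."""
    return [
        "kubectl", "logs",
        "--namespace", namespace,
        "--selector", label,
        "--all-containers",
        "--tail=-1",  # with a selector, kubectl only shows the last 10 lines by default
    ]


def fetch_driver_logs() -> str:
    """Run the command and return the combined log text (requires cluster access)."""
    result = subprocess.run(
        driver_log_command(), check=True, capture_output=True, text=True
    )
    return result.stdout
```

The sketch stops at fetching the logs; which line reports the driver version depends on the log format, so filtering is left to the reader.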