autopilot

Using Node-based pricing on GKE Autopilot

New this year, Autopilot now has two pricing models: the original Pod-based model, and the new node-based option. The pricing page does a pretty good job of explaining the difference (at least I hope it explains it well, as I wrote it), and how best to utilize each option, but here’s a quick recap anyway:… Continue reading Using Node-based pricing on GKE Autopilot

Serving Stable Diffusion with RayServe on GKE Autopilot

Now that you’ve tried out Stable Diffusion on GKE Autopilot via the WebUI, you might be wondering how you’d go about adding stable diffusion as a proper micro-service that other components of your application can call. One popular way is via Ray. Let’s try this tutorial: Serve a StableDiffusion text-to-image model on Kubernetes, on GKE… Continue reading Serving Stable Diffusion with RayServe on GKE Autopilot

GKE Autopilot: how to know if Pending pods will be scheduled

GKE Autopilot is pretty magical. You create a cluster just by picking a region and giving it a name, schedule Kubernetes workloads and the compute resources are provisioned automatically. While Kubernetes is provisioning resources, your Pods will be in the Pending state. This is all well and good, except… there are other reasons that your… Continue reading GKE Autopilot: how to know if Pending pods will be scheduled

Provisioning spare capacity in GKE Autopilot with placeholder balloon pods

Autopilot is a new mode of operation for Google Kubernetes Engine (GKE) where compute capacity is dynamically provisioned based on your pod’s requirements. Among other innovations, it essentially functions as a fully automatic cluster autoscaler. Update: GKE now has an official guide for provisioning spare capacity. When you deploy a new pod in this environment,… Continue reading Provisioning spare capacity in GKE Autopilot with placeholder balloon pods