Tag: ai
Running DeepSeek open reasoning models on GKE
DeepSeek’s R1 launch caused quite a stir as one of the first open reasoning models. Here’s how to run a demo of it locally on GKE! We can use an Nvidia L4 (or A100 40GB) to run the 8B Llama distilled model, or an A100 80GB to run the 14B and 32B Qwen… Continue reading Running DeepSeek open reasoning models on GKE
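To make the GPU sizing above concrete, here is a minimal sketch of a GKE Autopilot Deployment that requests a single L4 and serves the 8B distilled model with vLLM. This is an illustrative assumption, not the post's actual manifest; the deployment name and labels are made up, and you should pin the vLLM image tag for real use:

```yaml
# Hypothetical manifest: serve DeepSeek-R1-Distill-Llama-8B on an L4 via vLLM.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepseek-r1-distill
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deepseek
  template:
    metadata:
      labels:
        app: deepseek
    spec:
      # On Autopilot, this nodeSelector asks for an L4-backed node.
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest  # pin a specific tag in practice
          args: ["--model", "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"]
          ports:
            - containerPort: 8000  # OpenAI-compatible API endpoint
          resources:
            limits:
              nvidia.com/gpu: "1"
```

For the 14B/32B Qwen distills, the same shape works with `cloud.google.com/gke-accelerator: nvidia-a100-80gb` (or equivalent) in place of the L4 selector.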
Serving Stable Diffusion with RayServe on GKE Autopilot
Now that you’ve tried out Stable Diffusion on GKE Autopilot via the WebUI, you might be wondering how you’d go about adding Stable Diffusion as a proper microservice that other components of your application can call. One popular way is via Ray. Let’s try this tutorial: Serve a Stable Diffusion text-to-image model on Kubernetes, on GKE… Continue reading Serving Stable Diffusion with RayServe on GKE Autopilot
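As a rough sketch of the microservice shape, a KubeRay `RayService` bundles a Ray Serve application config with the Ray cluster that runs it. Everything below is an illustrative assumption rather than the tutorial's actual values: the app name, `import_path` module, and images are hypothetical placeholders.

```yaml
# Hypothetical RayService sketch (requires the KubeRay operator installed).
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: stable-diffusion
spec:
  # Ray Serve app config; import_path would point at a module defining a
  # Serve deployment that wraps the Stable Diffusion pipeline.
  serveConfigV2: |
    applications:
      - name: stable_diffusion
        import_path: stable_diffusion.app  # hypothetical module
        route_prefix: /
  rayClusterConfig:
    headGroupSpec:
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray-ml:latest  # assumes diffusers etc. are baked in
    workerGroupSpecs:
      - groupName: gpu-workers
        replicas: 1
        minReplicas: 1
        maxReplicas: 1
        rayStartParams: {}
        template:
          spec:
            # Autopilot provisions a GPU node matching this selector.
            nodeSelector:
              cloud.google.com/gke-accelerator: nvidia-l4
            containers:
              - name: ray-worker
                image: rayproject/ray-ml:latest
                resources:
                  limits:
                    nvidia.com/gpu: "1"
```

Once applied, other components call the Serve HTTP endpoint on the head service rather than embedding the model themselves, which is the microservice split the post is after.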
Stable Diffusion WebUI on GKE Autopilot
I recently set out to run Stable Diffusion on GKE in Autopilot mode, building a container from scratch using AUTOMATIC1111’s webui. This is likely not how you’d host a Stable Diffusion service for production (which would make a good topic for another blog post), but it’s a fun way to try out the… Continue reading Stable Diffusion WebUI on GKE Autopilot
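For a sense of how the webui container lands on Autopilot, here is a minimal sketch of a GPU-backed Deployment. The image path, names, and T4 choice are assumptions for illustration, not the post's actual setup; port 7860 is the webui's default listen port:

```yaml
# Hypothetical Deployment: run a custom-built AUTOMATIC1111 webui image on Autopilot.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stable-diffusion-webui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sd-webui
  template:
    metadata:
      labels:
        app: sd-webui
    spec:
      # Autopilot node selection for a GPU-backed node.
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-t4
      containers:
        - name: webui
          # Hypothetical image path: your own Artifact Registry build of the webui.
          image: us-docker.pkg.dev/PROJECT_ID/repo/sd-webui:latest
          ports:
            - containerPort: 7860  # webui default port
          resources:
            limits:
              nvidia.com/gpu: "1"
```

Pair it with a Service (or `kubectl port-forward`) to reach the UI; for a demo like this, port-forwarding avoids exposing the webui publicly.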