Getting resource recommendations from VPA

Setting appropriate resource requirements in Pods is one of the most critical tasks for a developer deploying to Kubernetes. In section 5.2 of my book, I discuss how you can use kubectl top pods to analyze usage and set resource values. If you’re fortunate enough to use GKE, where the VPA component is offered out of the box, there is another option for analyzing usage that makes this even easier and more accurate: running VPA in advisory mode.

Vertical Pod Autoscaler (VPA) is a useful tool that can automatically scale your containers’ resource requirements vertically (i.e. increasing or decreasing their CPU and memory requests). One really neat feature is that it can also be used in advisory mode to inform you of what resources your containers may need! In this mode, it runs all the same analysis to understand what resources your workload needs, without making any live modifications.

Here’s an example VPA configured in advisory mode:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: timeserver
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment
    name:       timeserver
  updatePolicy:
    updateMode: "Off"

vpa-advisor.yaml

Note that there are a couple of differences from the VPA I demo’d before. In particular, updateMode (under updatePolicy) is “Off”. This is important to prevent the VPA from changing your resources (unless you want that). Also, since all I’m looking for here is advice, I have not set the minimum or maximum resources, which I might do when using VPA in active mode. That’s because here I just want VPA’s opinion, regardless of my own values.

If you have multiple deployments, for convenience you can stack these “advisory” VPAs into a single file, with each separated by ---. Note that VPA only works with higher-order workload constructs like Deployment, StatefulSet and the like.
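As a rough sketch of that stacking, the shell helpers below print one advisory VPA per Deployment name, separated by ---. The function names, the frontend Deployment, and the vpa-advisors.yaml filename are all hypothetical, chosen only for illustration; the manifest body mirrors the vpa-advisor.yaml shown above.

```shell
# Print an advisory-mode VPA manifest for a single Deployment.
# (advisory_vpa is a hypothetical helper name, not part of any tool.)
advisory_vpa() {
  name="$1"
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ${name}
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: ${name}
  updatePolicy:
    updateMode: "Off"
EOF
}

# Print one advisory VPA per argument, with --- between documents.
stacked_vpas() {
  first=1
  for name in "$@"; do
    if [ "$first" -eq 0 ]; then
      echo "---"
    fi
    first=0
    advisory_vpa "$name"
  done
}

# Example (assumed Deployment names; substitute your own):
#   stacked_vpas timeserver frontend > vpa-advisors.yaml
#   kubectl create -f vpa-advisors.yaml
```

Each VPA here takes the same name as the Deployment it targets, which is just a convention borrowed from the example above.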

Here’s a complete demo with a Deployment, Service, a Job to generate load on the Service (for more interesting data), and the above VPA in advisory mode.

kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/deploy.yaml
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/svc.yaml
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/load-job.yaml
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/vpa-advisor.yaml

After deploying, give it a few minutes, then query the VPA to get its status.

$ kubectl get vpa
NAME         MODE   CPU   MEM        PROVIDED   AGE
timeserver   Off    65m   23068672   True       106s

You can also describe the VPA to get more information.

$ kubectl describe vpa
Name:         timeserver
Namespace:    demo
Labels:       <none>
Annotations:  <none>
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
Metadata:
  Creation Timestamp:  2024-11-05T22:26:59Z
  Generation:          3
  Resource Version:    60326435
  UID:                 72f1c657-ac42-4148-80ae-451cd2b21b6e
Spec:
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         timeserver
  Update Policy:
    Update Mode:  Off
Status:
  Conditions:
    Last Transition Time:  2024-11-05T22:28:24Z
    Message:               Some containers have a small number of samples
    Reason:                timeserver-container
    Status:                True
    Type:                  LowConfidence
    Last Transition Time:  2024-11-05T22:28:24Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  timeserver-container
      Lower Bound:
        Cpu:     8m
        Memory:  4194304
      Target:
        Cpu:     65m
        Memory:  23068672
      Uncapped Target:
        Cpu:     65m
        Memory:  23068672
      Upper Bound:
        Cpu:     94365m
        Memory:  32488030208
Events:          <none>
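
If you only want the Target values rather than the full description, a jsonpath query is one way to pull them out. This is a minimal sketch: vpa_targets is a hypothetical helper name, and the field paths assume the autoscaling.k8s.io/v1 status layout that the describe output above reflects.

```shell
# Print "container<TAB>cpu<TAB>memory" for each container's Target.
# Usage: vpa_targets <vpa-name>
vpa_targets() {
  kubectl get vpa "$1" -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
```

Running vpa_targets timeserver against the demo should print one line per container, carrying the same Target values as the describe output.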

One thing to note if you see this status:

Status:
  Conditions:
    Last Transition Time:  2024-11-05T22:16:01Z
    Status:                False
    Type:                  LowConfidence

This should actually be read as “low confidence = false”, i.e. it’s not a low-confidence result. I read that wrong the first time.
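
To avoid misreading the condition, you could query it directly and translate it into words. The sketch below is only illustrative: both function names are hypothetical, and the jsonpath filter assumes the condition type is spelled LowConfidence as in the output above.

```shell
# Print the raw status ("True" or "False") of the LowConfidence condition.
# Usage: vpa_low_confidence <vpa-name>
vpa_low_confidence() {
  kubectl get vpa "$1" -o jsonpath='{.status.conditions[?(@.type=="LowConfidence")].status}'
}

# Turn that status string into a plain-language reading.
describe_confidence() {
  case "$1" in
    True)  echo "low confidence: based on a small number of samples" ;;
    False) echo "not low confidence" ;;
    *)     echo "LowConfidence condition not reported" ;;
  esac
}

# Example:
#   describe_confidence "$(vpa_low_confidence timeserver)"
```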

Now that you have these values, you have an alternative to kubectl top. You can review them and update your Pods accordingly. For this example, the key information is the Target recommendation:

    Cpu:     65m
    Memory:  23068672

I might use this to update the Podspec and set my CPU and memory resource requests accordingly. Here’s my updated Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: timeserver
spec:
  replicas: 5
  selector:
    matchLabels:
      pod: timeserver-pod
  template:
    metadata:
      labels:
        pod: timeserver-pod
    spec:
      containers:
      - name: timeserver-container
        image: docker.io/wdenniss/timeserver:4
        resources:
          requests:
            cpu: 65m
            memory: 23068672
            ephemeral-storage: 5Gi
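
VPA reports memory in plain bytes, which Kubernetes accepts as-is, but a unit suffix is often easier to read in a manifest. As a small sketch, the hypothetical bytes_to_mi helper below converts a byte count to whole mebibytes, rounding up so the request is never smaller than the recommendation.

```shell
# Convert a byte count to whole Mi (1 Mi = 1048576 bytes), rounding up.
bytes_to_mi() {
  echo "$(( ($1 + 1048575) / 1048576 ))Mi"
}

bytes_to_mi 23068672   # prints 22Mi
```

The recommended 23068672 bytes is exactly 22 × 1048576, so writing memory: 22Mi in the Deployment would be equivalent to the value used above.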

Further reading