Setting appropriate resource requirements in Pods is one of the most important tasks for a developer deploying on Kubernetes. In section 5.2 of my book, I discuss how you can use kubectl top pods to analyze and set resource values. If you’re fortunate enough to use GKE, where the VPA component is offered out of the box, there is another option for analyzing usage that makes this even easier and more accurate: running VPA in advisory mode.
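If VPA isn’t already enabled on your GKE cluster, my understanding is that (at the time of writing) it can be turned on with a single flag; CLUSTER_NAME here is a placeholder for your own cluster’s name:

gcloud container clusters update CLUSTER_NAME \
    --enable-vertical-pod-autoscaling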
Vertical Pod Autoscaler is a useful tool that can automatically scale your containers’ resource requirements vertically (i.e. increasing/decreasing CPU and memory requests). One really neat feature is that it can also be used in advisory mode to inform you of what resources your containers may need! In this mode, it runs all the same analysis to understand what resources your workload needs, without making any live modifications.
Here’s an example VPA configured in advisory mode:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: timeserver
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: timeserver
  updatePolicy:
    updateMode: "Off"
Note that there are a couple of differences from the VPA I demoed before. In particular, updateMode is set to "Off". This is important to prevent the VPA from changing your resources (unless you want that). Also, since all I’m looking for here is advice, I have not set the minimum or maximum resources that I might set when using VPA in active mode. That’s because here I just want to get VPA’s opinion, unconstrained by my own values.
If you have multiple Deployments, for convenience you can stack these “advisory” VPAs into a single file, with each separated by ---, as shown below. Note that VPA only works with higher-order workload constructs like Deployment, StatefulSet, and the like.
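For example, a single file covering two Deployments might look like the following (the second Deployment name, frontend, is just a placeholder for illustration):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: timeserver
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: timeserver
  updatePolicy:
    updateMode: "Off"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: frontend
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: frontend
  updatePolicy:
    updateMode: "Off"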
Here’s a complete demo with a Deployment, Service, a Job to generate load on the Service (for more interesting data), and the above VPA in advisory mode.
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/deploy.yaml
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/svc.yaml
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/load-job.yaml
kubectl create -f https://raw.githubusercontent.com/WilliamDenniss/kubernetes-for-developers/master/Bonus/vpa/vpa-advisor.yaml
After deploying, give it a few minutes and then you can query the VPA to get its status.
$ kubectl get vpa
NAME         MODE   CPU   MEM        PROVIDED   AGE
timeserver   Off    65m   23068672   True       106s
You can also describe to get more information.
$ kubectl describe vpa
Name:         timeserver
Namespace:    demo
Labels:       <none>
Annotations:  <none>
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
Metadata:
  Creation Timestamp:  2024-11-05T22:26:59Z
  Generation:          3
  Resource Version:    60326435
  UID:                 72f1c657-ac42-4148-80ae-451cd2b21b6e
Spec:
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         timeserver
  Update Policy:
    Update Mode:  Off
Status:
  Conditions:
    Last Transition Time:  2024-11-05T22:28:24Z
    Message:               Some containers have a small number of samples
    Reason:                timeserver-container
    Status:                True
    Type:                  LowConfidence
    Last Transition Time:  2024-11-05T22:28:24Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  timeserver-container
      Lower Bound:
        Cpu:     8m
        Memory:  4194304
      Target:
        Cpu:     65m
        Memory:  23068672
      Uncapped Target:
        Cpu:     65m
        Memory:  23068672
      Upper Bound:
        Cpu:     94365m
        Memory:  32488030208
Events:  <none>
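If you just want the recommended values without the full describe output, you should also be able to extract them with jsonpath (this assumes the camelCase field names on the VPA object that correspond to the output above):

$ kubectl get vpa timeserver -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'

This prints the recommended CPU and memory for the first container in the recommendation list.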
One thing to note: if you see this status:
Status:
  Conditions:
    Last Transition Time:  2024-11-05T22:16:01Z
    Status:                False
    Type:                  LowConfidence
it should be read as “low confidence = false”, i.e. it’s not a low-confidence result. I read that wrong the first time.
Now that you have these values, you have another option besides kubectl top: you can review these recommendations and update your Pods accordingly. For this example, the key pieces of information are these two values:
Cpu: 65m
Memory: 23068672
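As an aside, the raw byte value VPA reports can be written with a Kubernetes quantity suffix if you prefer: 23068672 bytes is exactly 22Mi, so a request block like the following is equivalent:

resources:
  requests:
    cpu: 65m
    memory: 22Mi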
I might use this to update the Podspec and set my CPU and memory resource requests accordingly. Here’s my updated Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: timeserver
spec:
  replicas: 5
  selector:
    matchLabels:
      pod: timeserver-pod
  template:
    metadata:
      labels:
        pod: timeserver-pod
    spec:
      containers:
      - name: timeserver-container
        image: docker.io/wdenniss/timeserver:4
        resources:
          requests:
            cpu: 65m
            memory: 23068672
            ephemeral-storage: 5Gi
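If you keep the advisory VPA deployed alongside the updated workload, you can periodically check whether its recommendation has drifted from what you set, for example:

$ kubectl get vpa timeserver

or watch it continuously by adding the -w flag.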
Further reading
This article is bonus material to supplement my book Kubernetes for Developers.