I was checking the GKE cluster autoscaler configuration. Here’s how to inspect it, plus some notes on timing and cost.
How long does scale-up take?
In my experience with GKE:
- Scale-up: 2-5 minutes from pending pod to running
- Scale-down: 10+ minutes (configurable, conservative by default)
The scale-up time depends mostly on the node pool configuration and how long a new node takes to boot and register. Preemptible/spot nodes can be slightly faster. If you need faster scale-up, consider keeping a small buffer of spare capacity.
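One common way to hold that buffer is a low-priority placeholder ("balloon") deployment that the scheduler evicts the moment real pods need the room, so a warm node is already there when traffic spikes. A minimal sketch, where the names and request sizes are assumptions to adjust for your baseline:
kubectl apply -f - <<'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10
globalDefault: false
description: "Placeholder pods, evicted as soon as real workloads need the space"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-buffer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: capacity-buffer
  template:
    metadata:
      labels:
        app: capacity-buffer
    spec:
      priorityClassName: overprovisioning
      # pause does nothing; its requests just reserve room on a node
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
EOF
Size the requests so one buffer pod roughly matches your largest workload pod.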
View configuration
View autoscaling config:
gcloud container clusters describe my-cluster \
--region=europe-north1 \
--format="yaml(autoscaling)"
View node pool autoprovisioning defaults:
gcloud container clusters describe my-cluster \
--region=europe-north1 \
--format="yaml(autoscaling.autoprovisioningNodePoolDefaults)"
Check autoscaler status in the cluster:
kubectl get cm/cluster-autoscaler-status -n kube-system -o yaml
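Most of the useful detail is in the configmap's status field; to print just that part (assuming the standard data key the autoscaler writes):
kubectl get cm cluster-autoscaler-status -n kube-system \
-o jsonpath='{.data.status}'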
View node allocatable resources:
kubectl describe nodes | grep -A5 "Allocatable"
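The same describe output also shows what pods have actually requested per node, which is what the autoscaler compares against allocatable when deciding whether a node is needed:
kubectl describe nodes | grep -A8 "Allocated resources"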
Check scaling activity
Check which nodes can be scaled down:
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
Nodes with the ToBeDeletedByClusterAutoscaler taint are in the process of being drained and removed.
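Scale-up activity shows up too: the autoscaler records a TriggeredScaleUp event on the pending pod that caused it. Both directions can be checked with something like:
# Pods whose pending state triggered a scale-up
kubectl get events -A --field-selector reason=TriggeredScaleUp
# Nodes currently marked for removal
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints | grep ToBeDeletedByClusterAutoscaler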
Cost considerations
- Min nodes too high: paying for idle capacity
- Min nodes too low: cold starts during traffic spikes
- Scale-down too aggressive: nodes churning up and down
- Scale-down too conservative: paying for unused nodes
I typically set min to handle baseline traffic, max to handle peak + 20%, and leave scale-down delay at the default (10 minutes). For cost savings, use spot/preemptible nodes for workloads that can handle interruption.
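As a sketch of what that looks like in practice (pool names, node counts, and machine type are assumptions):
# Min/max for the main pool: min covers baseline, max covers peak + ~20%
gcloud container clusters update my-cluster \
--region=europe-north1 \
--node-pool=default-pool \
--enable-autoscaling --min-nodes=3 --max-nodes=10

# A separate spot pool for interruption-tolerant workloads
gcloud container node-pools create spot-pool \
--cluster=my-cluster \
--region=europe-north1 \
--spot \
--machine-type=e2-standard-4 \
--enable-autoscaling --min-nodes=0 --max-nodes=6
Note that for a regional cluster the min/max counts apply per zone, not per pool.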
To monitor scaling as it happens, keep an eye on the node list with watch and on resource usage with kubectl top.
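For example, in two terminals:
# Node count and status, refreshed every 10 seconds
watch -n 10 kubectl get nodes
# Current CPU/memory usage per node (needs metrics-server, which GKE ships by default)
kubectl top nodes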
Further reading
- GKE autoscaler best practices: Google’s recommendations