My Horizontal Pod Autoscaler wasn’t scaling as expected. Here’s how I debugged it.
My debugging steps
When HPA isn’t scaling:
- kubectl get hpa - Check current/target metrics and replica count
- kubectl describe hpa my-app - Look at the Conditions and Events sections
- kubectl top pods - Verify metrics-server is working
- Check if pods have resource requests defined (HPA needs them)
The most common issue I see: pods without CPU/memory requests. HPA can’t calculate a utilisation percentage without knowing the baseline.
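As a sketch of what that baseline looks like, here is an illustrative Deployment fragment (names and values are examples, not from a real app) with the requests HPA needs:

```yaml
# Illustrative Deployment fragment: HPA computes utilisation
# against resources.requests, so set them on every container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest   # placeholder image
          resources:
            requests:
              cpu: 250m       # baseline: 70% target means ~175m per pod
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
```

With a 250m request, a CPU target of 70% means HPA adds replicas once average usage per pod passes roughly 175m.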
Basic commands
Check HPA status:
kubectl get hpa
Describe for more detail:
kubectl describe hpa my-app
Look for the Conditions section - it tells you why scaling isn’t happening.
Check all HPAs across namespaces:
kubectl get hpa --all-namespaces
Common issues
- Metrics not available - Check metrics-server is running
- Target not found - Deployment name mismatch
- Min = Max - Can’t scale if they’re equal
- No resource requests - HPA can’t calculate percentage without them
- Cooldown period - HPA waits before scaling (default 5 mins for scale-down)
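The scale-down cooldown is configurable in the autoscaling/v2 API via the behavior field. A minimal sketch, with example values and an illustrative name:

```yaml
# autoscaling/v2 HPA fragment: shortens the default 300s
# scale-down stabilization window. Values are examples.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # must match the Deployment name exactly
  minReplicas: 2          # min must be below max or nothing scales
  maxReplicas: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 120   # default is 300
      policies:
        - type: Pods
          value: 1        # remove at most one pod per minute
          periodSeconds: 60
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

This also covers two of the issues above: the scaleTargetRef name mismatch and min equal to max.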
Check metrics-server:
kubectl get pods -n kube-system | grep metrics
View current metrics:
kubectl top pods
For monitoring commands, see monitoring with watch and top.
Metrics I use for scaling
- CPU - Good for compute-bound workloads. Target 50-70% utilisation.
- Memory - Less useful for scaling (memory usage often stays high after load drops, so replicas never scale back down). Better for alerting.
- Custom metrics - Queue depth, request latency, connections. More accurate for I/O-bound services.
For web services, CPU at 70% target typically works well. For queue workers, queue depth via custom metrics is more accurate.
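A queue-depth HPA might look like the sketch below. This assumes a metrics adapter (e.g. prometheus-adapter) is exposing the metric; the metric name, target value, and workload names are all illustrative:

```yaml
# Pods-type custom metric: scale on average queue depth per pod.
# Requires a custom metrics adapter; names here are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: queue_depth        # assumed metric name from the adapter
        target:
          type: AverageValue
          averageValue: "30"       # e.g. target ~30 messages per pod
```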
Nuclear option
If HPA is stuck, sometimes deleting and recreating helps:
kubectl delete hpa my-app
kubectl apply -f hpa.yaml
Further reading
- HPA algorithm details - how scaling decisions are made