My Horizontal Pod Autoscaler wasn’t scaling as expected. Here’s how I debugged it.

My debugging steps

When HPA isn’t scaling:

  1. kubectl get hpa - Check current/target metrics and replica count
  2. kubectl describe hpa my-app - Look at Conditions and Events sections
  3. kubectl top pods - Verify metrics-server is working
  4. Check if pods have resource requests defined (HPA needs them)

The most common issue I see: pods without CPU/memory requests. HPA can’t calculate utilisation percentage without knowing the baseline.
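As a minimal sketch of what those requests look like, here's an illustrative Deployment fragment (the name, image, and values are examples, not from my setup):

```yaml
# Illustrative Deployment fragment. HPA computes CPU utilisation as
# (current usage / requests), so requests must be set on every container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:latest   # example image
          resources:
            requests:
              cpu: 100m          # the baseline HPA divides usage by
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
```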

Basic commands

Check HPA status:

kubectl get hpa

Describe for more detail:

kubectl describe hpa my-app

Look for the Conditions section - it tells you why scaling isn’t happening. The condition types to check are AbleToScale, ScalingActive, and ScalingLimited.

Check all HPAs across namespaces:

kubectl get hpa --all-namespaces

Common issues

  1. Metrics not available - Check metrics-server is running
  2. Target not found - Deployment name mismatch
  3. Min = Max - Can’t scale if they’re equal
  4. No resource requests - HPA can’t calculate percentage without them
  5. Cooldown period - HPA waits before scaling (default 5 mins for scale-down)
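Issues 2, 3, and 5 all map to fields in the HPA spec itself. Here’s a sketch of an autoscaling/v2 manifest with those fields annotated (names and values are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app        # must match the Deployment name exactly (issue 2)
  minReplicas: 2        # must be below maxReplicas or nothing can scale (issue 3)
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # the default 5-minute scale-down cooldown (issue 5)
```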

Check metrics-server:

kubectl get pods -n kube-system | grep metrics

View current metrics:

kubectl top pods

For monitoring commands, see monitoring with watch and top.

Metrics I use for scaling

  • CPU - Good for compute-bound workloads. Target 50-70% utilisation.
  • Memory - Less useful for scaling (memory doesn’t release as quickly). Better for alerting.
  • Custom metrics - Queue depth, request latency, connections. More accurate for I/O-bound services.

For web services, CPU at 70% target typically works well. For queue workers, queue depth via custom metrics is more accurate.
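For the queue-worker case, a Pods-type custom metric looks like this sketch. It assumes a custom metrics adapter (e.g. prometheus-adapter) is already installed and exposing the metric; `queue_depth` and the target value are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: queue_depth        # hypothetical metric served by a custom metrics adapter
        target:
          type: AverageValue
          averageValue: "30"       # scale so each pod handles roughly 30 queued items
```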

Nuclear option

If HPA is stuck, sometimes deleting and recreating it helps. This is safe in the sense that the Deployment keeps its current replica count while the HPA is gone:

kubectl delete hpa my-app
kubectl apply -f hpa.yaml

Further reading