TLDR: VPA’s updater defaults to requiring 2 replicas before it will evict a pod. Single-replica deployments are silently excluded from auto-healing — even if they’re crashlooping. You can override this with minReplicas: 1 in the VPA spec, no cluster upgrade needed.


I had a pod stuck in CrashLoopBackOff for 20 hours with 192 restarts. VPA had pushed its memory limit down to ~157Mi — too low for a Python service — and the pod was OOMing on every start.

The confusing part: VPA knew the pod needed more memory. Its target recommendation was ~301Mi. But it wouldn’t evict the pod to apply it.
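You can see this mismatch directly on the VPA object: the status carries the recommendation even when the updater never acts on it. A quick sketch, assuming a VPA object named my-service-vpa:

```shell
# Full status, including target and lower/upper bounds per container
kubectl describe vpa my-service-vpa

# Or pull just the recommendations out of the status
kubectl get vpa my-service-vpa \
  -o jsonpath='{.status.recommendation.containerRecommendations}'
```

If the target there is well above the pod’s current request but the pod never gets evicted, you’re likely hitting the minReplicas gate described below.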

The minReplicas default

VPA’s updater component has a --min-replicas flag that defaults to 2. If a deployment has fewer replicas than this, VPA will never evict its pods — the logic being that evicting the only replica would cause downtime.

The irony is that a crashlooping pod is already down, so VPA’s “protection” just guarantees the problem persists. The pod restarts in place (kubelet handles container restarts), but VPA only applies new resources when a pod is created (via its admission webhook) — and it won’t trigger a creation because it refuses to evict.

I initially thought I needed to upgrade to GKE 1.35.2 where this default changes to 1. I went down a rabbit hole upgrading clusters, hit a stuck emulated version that blocked all upgrades, and eventually found the fix was much simpler.

The fix

You can override minReplicas per VPA object. This has been available since Kubernetes 1.22 — no cluster upgrade needed:

spec:
  updatePolicy:
    updateMode: Auto
    minReplicas: 1
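For context, here’s where that snippet sits in a complete VPA object (the metadata and targetRef names are illustrative):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-service-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  updatePolicy:
    updateMode: Auto
    # Allow eviction even when the Deployment has a single replica
    minReplicas: 1
```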

For self-managed VPA, you can also set the global default with the --min-replicas flag on the updater deployment.
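For the self-managed case, the flag goes on the updater container. A sketch of the relevant excerpt of the vpa-updater Deployment (the image tag is illustrative):

```yaml
# Excerpt from the vpa-updater Deployment spec
containers:
  - name: updater
    image: registry.k8s.io/autoscaling/vpa-updater:1.2.0
    args:
      # Global default: evict pods even for single-replica workloads
      - --min-replicas=1
```

A per-VPA minReplicas still takes precedence over this global flag, so you can keep the conservative default and opt individual workloads in.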

Preventing the OOM in the first place

VPA only pushed memory that low because it had no floor. I added minAllowed to set a baseline per runtime:

spec:
  resourcePolicy:
    containerPolicies:
      - containerName: my-service
        minAllowed:
          memory: 256Mi

I landed on 256Mi for Python, 128Mi for Node.js, and 64Mi for Go — roughly the minimum each runtime needs to start up and handle a request. VPA can still recommend higher, but it can’t go below the floor.

Bonus: GKE’s emulated version trap

While trying to upgrade to 1.35.2, I hit a separate issue worth mentioning. GKE uses a two-step upgrade for minor versions — it upgrades the binary first but keeps the API at the old version (the “emulated version”) for a soak period. One of my clusters had been soaking for over a week and hadn’t auto-finalised, blocking all upgrades:

Cannot do minor upgrade if master has different master version and
emulated version: current master version: "1.34.5-gke.1153000",
current emulated version: "1.33"

Nothing in gcloud container clusters upgrade --help could fix it. The command I needed was buried in the upgrade docs:

gcloud beta container clusters complete-control-plane-upgrade CLUSTER_NAME \
  --location=REGION

This forces the emulated version to match the binary, unblocking everything.

Further reading