Pod "etcd-backup-with-interval-" in "NotReady" State

Problem

One or more etcd-backup-with-interval- pods in the kube-system namespace are in a NotReady state, e.g.

Bash
Copy

The pod is associated with a job which is showing 0/1 completions, e.g.

Bash
Copy

The pod log shows only that it has created a temporary DB file with no further output, e.g.

Bash
Copy

A kubectl describe job shows that the State is Running .

Bash
Copy

Environment

  • Platform9 Managed Kubernetes – v5.7 and Higher

Cause

The etcdctl snapshot save command is "hanging" or failing to complete as it is missing the following Environment section/variables which control the flags to be passed to the etcdctl command-line utility which are necessary for TLS authentication.

Bash
Copy

Thus, the job never transitions to Succeeded or Failed and the pod will continue to be re-created.

Resolution

  1. Delete the job which is associated with the pod .
  2. The pod will be terminated, and a new pod will be re-created associated with a new job resource which is using an updated spec template.
Type to search, ESC to discard
Type to search, ESC to discard
Type to search, ESC to discard