Troubleshooting Pod Issues
ImagePullBackOff
If you encounter an ImagePullBackOff error with one or more pods, the image may no longer be available upstream (e.g. no longer on Docker Hub) or within the specified private repository.
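You can typically confirm the cause in the pod's events – look for Failed or BackOff messages referencing the image pull (pod name and namespace below are placeholders):
$ kubectl describe pod <pod-name> -n <namespace>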
It may still be possible to leverage this image if it is present on one or more nodes, as indicated by running pods for the same StatefulSet or Deployment. For example:
$ kubectl get pods -n <namespace> -o wide -l app=minio
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
minio-0 0/1 ImagePullBackOff 0 141m 10.3.70.76 10.2.99.39 <none> <none>
minio-1 1/1 Running 0 146m 10.3.4.4 10.2.99.42 <none> <none>
minio-2 1/1 Running 0 152m 10.3.92.8 10.2.99.43 <none> <none>
minio-3 1/1 Running 0 148m 10.3.17.4 10.2.99.44 <none> <none>
Save/Load Docker Image from Cache
- Identify the image – either by name or image ID, which can be found in the kubectl describe pod or kubectl get pod output – for any Running pod.
$ kubectl get pod minio-1 -n <namespace> -o json | jq '.spec.containers[].image'
"minio/minio:RELEASE.2021-05-27T22-06-31Z"
- SSH to the Node on which the targeted Running pod resides, and list the Docker images.
$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
minio/minio RELEASE.2021-05-27T22-06-31Z 375bd7b2cc2c 4 weeks ago 264MB
- Save the Docker image – specifying the Image ID, Repository, and Tag.
$ sudo docker save <image-id> -o <filename>.tar <repository>:<tag>
Note: If the <repository>:<tag> is not specified, the Docker image will be imported without any identifying information, and Kubernetes will continue to fail to spawn the pod as a result.
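For example, using the minio image identified above (minio.tar is an arbitrary filename):
$ sudo docker save 375bd7b2cc2c -o minio.tar minio/minio:RELEASE.2021-05-27T22-06-31Z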
- Transfer the file (e.g. using rsync or scp) to the destination Node where the pod is in the ImagePullBackOff state.
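For example, with scp (the user and destination path are placeholders):
$ scp <filename>.tar <user>@<destination-node>:/tmp/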
- Load the Docker image on that Node.
$ sudo docker load -i <filename>.tar
Example
$ sudo docker load -i <filename>.tar
Loaded image ID: sha256:375bd7b2cc2ce76d6216ae68f68b54913a8f2df4a3d2c34919bd33d23a02f409
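Once the image is loaded, the kubelet should find it in the local cache on its next pull attempt, provided the pod's imagePullPolicy is not Always. If the pod remains in ImagePullBackOff, deleting it prompts the StatefulSet or Deployment controller to recreate it – for example, using the stuck pod from the earlier listing:
$ kubectl delete pod minio-0 -n <namespace>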
CrashLoopBackOff
NodeAffinity
$ kubectl get pods -A | grep -i sentry
platform9-system pf9-sentry-688cfcf9-h52v6 1/1 Running 1 3h22m
platform9-system pf9-sentry-688cfcf9-s4fjm 0/1 NodeAffinity 0 3h33m
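The events that led to this state can be viewed by describing the affected pod, for example:
$ kubectl describe pod pf9-sentry-688cfcf9-s4fjm -n platform9-system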
Type    Reason           Age    From               Message
----    ------           ----   ----               -------
Warning FailedScheduling 11m default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Warning FailedScheduling 11m default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Normal Scheduled 11m default-scheduler Successfully assigned platform9-system/pf9-sentry-688cfcf9-s4fjm to 10.128.230.29
Normal Pulled 11m kubelet Container image "platform9/pf9-sentry:1.0.0" already present on machine
Normal Created 11m kubelet Created container pf9-sentry
Normal Started 11m kubelet Started container pf9-sentry
Warning Unhealthy 6m29s kubelet Liveness probe failed: Get http://10.20.244.65:8080/v1/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning NodeAffinity 87s kubelet Predicate NodeAffinity failed
You may safely delete the pod stuck in the NodeAffinity status, as another pod has already been spawned for the same application.
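For example, using the pod name from the listing above:
$ kubectl delete pod pf9-sentry-688cfcf9-s4fjm -n platform9-system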